Outcome Context — Query Layer Reference

Date: 2026-05-04 • Status: Living reference, refined as integration progresses.

This document is the orientation point for the Outcome Context query layer and the storage layer it sits on top of. It describes what the system is, what each side owns, and how the two halves dock. It assumes the four convergence decisions D1–D4 from the gap analysis — read that doc for the diff history and rationale; this doc describes the system after those decisions land.

A working overview: this is the practical reference for the shipped query layer.

Decisions are not one-way doors. Treat the contract as load-bearing for now, not permanent. If a real constraint surfaces during integration, update the spec and this doc.

1. The system at a glance

The Outcome Context query layer is a context-first GraphQL surface that evaluates expressions over per-world simulation output stored in ClickHouse.

┌────────────────────────┐    ┌──────────────────────────────────────────┐    ┌──────────────────┐
│ Simulation pipeline    │    │ Query layer                              │    │ Consumers        │
│ (existing — separate)  │    │ ──────────                               │    │ (UIs, services)  │
│                        │    │ Outer ring   GraphQL surface             │    │                  │
│  Marten + ClickHouse   │ →  │ Middle ring  rules, expansion, postfix   │ →  │ GraphQL queries  │
│  writes Outcome        │    │ Inner ring   stack evaluator over worlds │    │                  │
│  Context rows          │    │                                          │    │                  │
└────────────────────────┘    │ reads via IOutcomeContextStore           │    └──────────────────┘
                              └──────────────────────────────────────────┘

The simulation runs once. It produces an Outcome Context — a set of rows, each holding one outcome's per-world values, plus a catalog of which outcomes exist for the run's fixtures and rosters. The query layer reads those rows and applies arbitrary expressions over them. No simulation runs at query time.

What lives where.

| Layer | Owned by | Responsibility |
| --- | --- | --- |
| Simulation pipeline | (existing; out of scope for this doc) | Runs the sim, produces OutcomeRow data, populates the pre-simulation catalog. |
| Storage contracts | lbs.foundry/src/OutcomeContext/LBS.OutcomeContext.Contracts/ | Sport-agnostic data types: OutcomeRow, OutcomeDefinition, GameOutcomeContext, SeasonOutcomeContext, OutcomeIdParser, TimePeriodConstants. |
| Storage (sport binding) | lbs.foundry/src/Models/AmericanFootball/LBS.Model.AmericanFootball.* | Persists Outcome Context rows, the outcome template catalog, and the pre-simulation catalog. Owns the American Football catalogue + accumulators. Exposes IOutcomeContextStore for reads. |
| Query layer | lbs.foundry/src/OutcomeContext/LBS.OutcomeContext.Query/ (engine; prototype is the historical reference) | Sport-agnostic GraphQL surface, hard-rule enforcement, expression canonicalisation, postfix evaluation. |

2. The Outcome Context model

2.1 What an Outcome Context is

A scope (one game, one season) plus the outcomes that were measured for it across every world the sim produced.

| Concept | Meaning |
| --- | --- |
| World | One run of the simulation for the scope. Worlds 0..N−1 are positionally aligned within a context. |
| Outcome | A named, measured quantity (PASSING_YARDS_HALF1_mahomes_patrick_1). Each outcome has one value per world. |
| OutcomeRow | One outcome's per-world values for one context. Values[i] is the outcome's value in world i. |
| Scope | A single game (game_id) or a single season (season_id). Each scope has one Outcome Context per simulation run. |

2.2 The two-tier model — raw stored, derived expanded

The catalog has two kinds of outcomes. Storage stores raw outcomes only. Booleans and other derived outcomes are expressions, expanded by the query layer at query time.

| Tier | What it is | Where it lives | Examples |
| --- | --- | --- | --- |
| Raw outcome | A measured per-world value the simulation produced | OutcomeRow rows in storage. valueType ∈ {Numeric, Ordinal, Temporal} | TOTAL_TDS_GAME_mahomes_patrick_1 (Numeric), SCORING_RANK_GAME_kelce_travis_1 (Ordinal), TIME_FIRST_TD_GAME_mahomes_patrick_1 (Temporal) |
| Derived outcome | A named expression over raw outcomes | Catalog OutcomeDefinition.Definition (postfix template). No rows. valueType: Boolean (almost always) | ANYTIME_TD_GAME_{p} defined as TOTAL_TDS_GAME_{p} 1 GTE |

Why this split: storage is denser (booleans collapse into the underlying counter), the cache key is natural (the expanded canonical form is the cache key), and the expression model stays uniform — every leaf at evaluation time is a raw outcome.

2.3 Worlds and value semantics

| Value type | Meaning | NaN rule |
| --- | --- | --- |
| Numeric | A count, a distance, a percentage. Arithmetic is meaningful. | NaN for "no value this world" (rare for numerics). |
| Ordinal | A rank within the world (1st, 2nd…). Comparison is meaningful; arithmetic is not. | NaN if the participant didn't appear in the ranking that world. |
| Temporal | A time within the world (e.g. seconds-from-kickoff of first TD). Comparison is meaningful. | NaN if the event didn't occur that world. Excluding NaN worlds from validWorlds is the convention. |
| Boolean | A truthy/falsy value per world. Always derived. | Comparison and logical ops produce booleans; raw booleans don't appear in storage. |

All rows in the same context have the same Values array length. That length is the context's worldCount.

3. The contract surface

This is what storage exposes to the query layer. The .NET types live in LBS.OutcomeContext.Contracts (sport-agnostic, referenced by both the storage write path in LBS.Model.AmericanFootball.Accumulation and the query read path in LBS.OutcomeContext.Query).

3.1 Data types

public record OutcomeRow(string OutcomeId, int[] Values);

public record OutcomeDefinition(
    string  OutcomeId,        // concrete id OR template id (see §4)
    string  Category,
    string  ValueType,        // Numeric | Boolean | Ordinal | Temporal
    string? Definition);      // postfix template; null/empty for raw

public class GameOutcomeContext
{
    public required string GameId         { get; init; }
    public required string SeasonId       { get; init; }
    public required int    ContextVersion { get; init; }
    public required IReadOnlyList<OutcomeRow> Rows { get; init; }
}

public class SeasonOutcomeContext
{
    public required string SeasonId       { get; init; }
    public required int    ContextVersion { get; init; }
    public required IReadOnlyList<OutcomeRow> Rows { get; init; }
}

Notes:

  - OutcomeRow.Values[i] is the outcome in world i. All rows in a context have the same array length.
  - OutcomeDefinition.Definition is a postfix template string for derived outcomes (D2). null or empty string means the outcome is raw and has rows in the Outcome Context.
  - GameOutcomeContext.SeasonId is what links a game to its parent season for joint queries.
  - ContextVersion is monotonic per scope; storage uses ReplacingMergeTree(context_version) so re-runs at the same version overwrite. After D1, ContextVersion is no longer used to enforce alignment across contexts.

3.2 The query layer's view — IOutcomeContext

The evaluator consumes a single abstraction over both context types:

public interface IOutcomeContext
{
    string   ScopeType      { get; }   // "GAME" or "SEASON"
    string   ScopeId        { get; }   // GameId or SeasonId
    string   WorldSetRef    { get; }   // informational — "{SeasonId}#{ContextVersion}"
    int      ContextVersion { get; }
    int      WorldCount     { get; }   // = Rows[0].Values.Length
    bool     HasOutcome(string outcomeId);
    int[] Values(string outcomeId);
    IEnumerable<string> OutcomeIds { get; }
}

The concrete GameOutcomeContext / SeasonOutcomeContext types implement this. ScopeId, ScopeType, WorldSetRef, and WorldCount are computed adapter properties — they are not stored fields.
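
The computed properties can be illustrated with a small wrapper. This is a minimal sketch, not the shipped implementation: GameContextAdapter is a hypothetical name, the contract types are re-declared locally so the block compiles on its own, and treating an empty context as having zero worlds is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Local copies of the contract types from 3.1 and 3.2 so the sketch stands alone.
public record OutcomeRow(string OutcomeId, int[] Values);

public interface IOutcomeContext
{
    string ScopeType { get; }
    string ScopeId { get; }
    string WorldSetRef { get; }
    int ContextVersion { get; }
    int WorldCount { get; }
    bool HasOutcome(string outcomeId);
    int[] Values(string outcomeId);
    IEnumerable<string> OutcomeIds { get; }
}

// Hypothetical adapter; the document says the concrete context types implement the interface directly.
public sealed class GameContextAdapter : IOutcomeContext
{
    private readonly Dictionary<string, int[]> _byId;
    private readonly string _seasonId;

    public GameContextAdapter(
        string gameId, string seasonId, int contextVersion, IReadOnlyList<OutcomeRow> rows)
    {
        ScopeId = gameId;
        _seasonId = seasonId;
        ContextVersion = contextVersion;
        _byId = rows.ToDictionary(r => r.OutcomeId, r => r.Values);

        // Assumption: an empty context reports zero worlds.
        WorldCount = rows.Count == 0 ? 0 : rows[0].Values.Length;

        // Every row in a context must share the same Values length.
        if (rows.Any(r => r.Values.Length != WorldCount))
            throw new ArgumentException("All rows in a context must have the same Values length.");
    }

    public string ScopeType => "GAME";
    public string ScopeId { get; }
    public string WorldSetRef => $"{_seasonId}#{ContextVersion}"; // informational only under D1
    public int ContextVersion { get; }
    public int WorldCount { get; }
    public bool HasOutcome(string outcomeId) => _byId.ContainsKey(outcomeId);
    public int[] Values(string outcomeId) => _byId[outcomeId];
    public IEnumerable<string> OutcomeIds => _byId.Keys;
}
```

Nothing beyond the stored fields (GameId, SeasonId, ContextVersion, Rows) is needed to answer the interface; the rest is derived at construction time.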

3.3 The read interface — IOutcomeContextStore (D4)

Storage exposes a typed read interface. The query layer depends on the abstraction; it never sees ClickHouse.

public interface IOutcomeContextStore
{
    Task<IOutcomeContext?> GetByScopeIdAsync(
        string scopeId,
        int? contextVersion = null,
        CancellationToken ct = default);

    Task<IReadOnlyList<IOutcomeContext>> GetManyByScopeIdAsync(
        IReadOnlyList<string> scopeIds,
        int? contextVersion = null,
        CancellationToken ct = default);

    Task<IOutcomeTemplateCatalog> GetTemplateCatalogAsync(
        CancellationToken ct = default);

    Task<IPreSimulationCatalog> GetPreSimulationCatalogAsync(
        string runId,
        CancellationToken ct = default);
}

Method shape is provisional and will firm up during the storage-layer rework. Three patterns are non-negotiable:

  1. Single-context fetch by scopeId. A gameContext(gameId) query loads one context.
  2. Multi-context batch fetch. A seasonContext(seasonId, gameIds:) query loads 1 + N contexts in one round-trip — N=16 in NFL — not N+1 sequential reads.
  3. Catalog access. Both the outcome template catalog and the pre-simulation catalog (for a run) are reachable through the same interface.

What's not yet decided: how contextVersion resolution works when omitted (latest? latest-before-cutoff?), whether projection — fetching values for a subset of outcome_ids only — is a first-class parameter or handled inside the implementation, and whether a single store interface serves both the contexts and the catalogues or they split.
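
The batch-fetch pattern can be sketched against a reduced interface. Everything here except the GetManyByScopeIdAsync signature is an assumption made for illustration: IContextReader, StubContext, InMemoryReader and SeasonLoader are hypothetical names, and the in-memory reader only counts calls so the single round-trip is visible.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Reduced stand-ins for the contracts in section 3: just enough surface for the sketch.
public interface IOutcomeContext { string ScopeId { get; } int WorldCount { get; } }
public record StubContext(string ScopeId, int WorldCount) : IOutcomeContext;

public interface IContextReader
{
    Task<IReadOnlyList<IOutcomeContext>> GetManyByScopeIdAsync(
        IReadOnlyList<string> scopeIds,
        int? contextVersion = null,
        CancellationToken ct = default);
}

// Hypothetical in-memory reader that counts round-trips.
public sealed class InMemoryReader : IContextReader
{
    private readonly Dictionary<string, IOutcomeContext> _contexts;
    public int RoundTrips { get; private set; }

    public InMemoryReader(IEnumerable<IOutcomeContext> contexts) =>
        _contexts = contexts.ToDictionary(c => c.ScopeId);

    public Task<IReadOnlyList<IOutcomeContext>> GetManyByScopeIdAsync(
        IReadOnlyList<string> scopeIds,
        int? contextVersion = null,
        CancellationToken ct = default)
    {
        RoundTrips++;
        IReadOnlyList<IOutcomeContext> found = scopeIds
            .Where(_contexts.ContainsKey)
            .Select(id => _contexts[id])
            .ToList();
        return Task.FromResult(found);
    }
}

public static class SeasonLoader
{
    // Loads the season plus its games with one batched call: 1 + N scopes, one round-trip.
    public static Task<IReadOnlyList<IOutcomeContext>> LoadAsync(
        IContextReader reader,
        string seasonId,
        IReadOnlyList<string> gameIds,
        CancellationToken ct = default)
    {
        var scopeIds = new List<string>(gameIds.Count + 1) { seasonId };
        scopeIds.AddRange(gameIds);
        return reader.GetManyByScopeIdAsync(scopeIds, null, ct);
    }
}
```

The sketch silently skips scope ids it cannot find; a real implementation would more likely surface UNKNOWN_SCOPE, which is left out here because the store's error behaviour is not specified.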

3.4 The ClickHouse layer beneath the interface

Storage today is two ClickHouse tables (validated end-to-end at 100K worlds on Production 3×16 — see status-summary.md).

CREATE TABLE game_outcome_context
(
    game_id          LowCardinality(String),
    season_id        LowCardinality(String),
    context_version  UInt32,
    outcome_id       LowCardinality(String),
    values           Array(Int32) CODEC(Delta, ZSTD)
)
ENGINE = ReplacingMergeTree(context_version)
ORDER BY (game_id, outcome_id)
PARTITION BY season_id;

CREATE TABLE season_outcome_context
(
    season_id        LowCardinality(String),
    context_version  UInt32,
    outcome_id       LowCardinality(String),
    values           Array(Int32) CODEC(Delta, ZSTD)
)
ENGINE = ReplacingMergeTree(context_version)
ORDER BY (season_id, outcome_id);

One row per (scope, outcome_id). The values array has one Int32 entry per world (outcome values are rounded to ints for storage); index = world index. Schema review and tuning history is documented in clickhouse-schema-review.md.

The query layer never executes SQL. The IOutcomeContextStore implementation is the only thing inside this boundary.

4. The two catalogues

D3 splits the storage-side IOutcomeCatalogue into two artefacts.

4.1 Outcome template catalog (storage)

Roster-agnostic. Lists outcome templates with metadata describing each placeholder.

public record OutcomeTemplate(
    string  OutcomeIdTemplate,    // e.g. "ANYTIME_TD_GAME_{participantId}"
    string  Category,
    string  ValueType,
    string? Definition,           // postfix template; null/empty for raw
    IReadOnlyList<TemplateSlot> Slots);

public record TemplateSlot(
    string Name,                  // e.g. "participantId"
    string RoleType);             // "participant" | "team" | "season" | "opposingParticipant"

Templates are stable across runs and small in count (tens, not thousands). The format of Definition is postfix RPN with {slotName} placeholders. Authoring may be in infix; a build step canonicalises to postfix before populating storage.

Examples:

  - TOTAL_TDS_GAME_{participantId}: raw; Definition: null; Numeric.
  - ANYTIME_TD_GAME_{participantId}: derived; Definition: "TOTAL_TDS_GAME_{participantId} 1 GTE"; Boolean.
  - TOP_SCORER_SEASON_{participantId}: derived; Definition: "TOTAL_TDS_SEASON_{participantId} TOTAL_TDS_SEASON_{otherParticipantId} GT"; Boolean. Two slots (participantId, otherParticipantId).

4.2 Pre-simulation catalog (storage)

Run-scoped. The fixtures, teams, and rosters that the simulation operates on. Established at sim start and frozen for the run. Every OutcomeContext produced by the run references this catalog implicitly.

A single document (per run, per sport) describes:

  - The set of teams in the run.
  - The roster on each team: players with stable identifiers.
  - The schedule of fixtures (game id, home, away, season, week).

The query layer reaches into this catalog when it needs to materialise concrete outcome IDs from templates — e.g. "give me all outcomes for kc_buf_w1" combines the template catalog with KC's and BUF's rosters from the pre-sim catalog for the run.

Open sub-design. Canonical name (SimulationCatalog? RunManifest? FixtureCatalog?), persistence target (Marten document store? ClickHouse? flat JSON per run?), and identification scheme are unsettled. The gap analysis lists the open questions; pick one and document the choice during Phase 3 code work.

4.3 Concrete IDs from templates × pre-sim catalog

When a consumer references ANYTIME_TD_GAME_mahomes_patrick_1 against gameContext(gameId: "kc_buf_w1"):

  1. The middle ring loads the game context, the template catalog, and the run's pre-sim catalog.
  2. It looks up the template ANYTIME_TD_GAME_{participantId} and sees that the Definition is TOTAL_TDS_GAME_{participantId} 1 GTE and that the slot {participantId} has RoleType: "participant".
  3. It substitutes {participantId} with mahomes_patrick_1 (validated against the pre-sim catalog: yes, that player is on KC's roster).
  4. Expansion produces TOTAL_TDS_GAME_mahomes_patrick_1 1 GTE — a postfix expression with raw leaves only.
  5. The evaluator runs.
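
Steps 2 to 4 above can be sketched as plain string substitution over the postfix template. This is an illustration under assumptions, not the ExpressionCanonical implementation: TemplateExpansion and its methods are hypothetical names, the roster is reduced to a set of participant ids, and the exception types are placeholders because the document does not name an error code for an unknown participant.

```csharp
using System;
using System.Collections.Generic;

public static class TemplateExpansion
{
    // Replaces every {slotName} placeholder in a postfix template with its bound value.
    public static string Substitute(string template, IReadOnlyDictionary<string, string> bindings)
    {
        var result = template;
        foreach (var (slot, value) in bindings)
            result = result.Replace("{" + slot + "}", value);

        // Any brace left over means a slot had no binding.
        if (result.Contains('{') || result.Contains('}'))
            throw new InvalidOperationException($"Unbound slot in template: {result}");
        return result;
    }

    // Single-slot case: validate the participant, substitute, split into postfix tokens.
    public static string[] ExpandForParticipant(
        string definition, string participantId, ISet<string> rosterIds)
    {
        // Assumption: validation is a simple membership test against the run's rosters.
        if (!rosterIds.Contains(participantId))
            throw new ArgumentException($"Participant not in the run's rosters: {participantId}");

        var bindings = new Dictionary<string, string> { ["participantId"] = participantId };
        return Substitute(definition, bindings)
            .Split(' ', StringSplitOptions.RemoveEmptyEntries);
    }
}
```

With the definition "TOTAL_TDS_GAME_{participantId} 1 GTE" and participant mahomes_patrick_1 on the roster, the result is the three tokens TOTAL_TDS_GAME_mahomes_patrick_1, 1, GTE, which is the raw-leaf postfix from step 4.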

A consumer that only knows about templates discovers them through GraphQL outcomeDefinitions(filter). A consumer that wants concrete IDs for a specific run uses outcomes(filter) on the loaded context.

5. Hard rules

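The query layer enforces four numbered hard rules (1, 2, 4 and 5; Rule 3 was dropped under D1) plus two further checks for unknown scopes and type mismatches. Each violation caught by the resolver produces a specific error code so consumers can branch on it.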

| # | Rule | Trigger | Error code |
| --- | --- | --- | --- |
| 1 | Canonical wire format is infix ExpressionInput. | Schema validation. | (caught by GraphQL schema; no resolver-level code) |
| 2 | Cross-context worldCount must match. | Server materialises each scope and compares array lengths. | CONTEXT_SIZE_MISMATCH |
| 4 | Every leaf must resolve to a known outcome. | Resolution at expansion time and at evaluator entry. | UNKNOWN_OUTCOME_ID |
| 5 | Malformed expression beyond schema. | Structural check in InfixToPostfix.Walk and stack-count check in evaluator. | MALFORMED_EXPRESSION |
|  | Unknown scope | scopeId doesn't exist in storage. | UNKNOWN_SCOPE |
|  | Type mismatch | Type checker rejects (e.g. logical op on Numeric operand). | TYPE_MISMATCH |

Rule 3 (CONTEXT_VERSION_MISMATCH) is dropped under D1. worldSetRef is computed from (SeasonId, ContextVersion) and exposed in responses, but the values are no longer required to match across contexts in a multi-scope query. The orchestrator is the source of correlation. Rule 2 catches the most common shape of misalignment (different run sizes); orchestrator discipline catches the rest.

A single malformed or unresolvable expression rejects the whole batch. No partial-success semantics in Wave 1 — revisit if a product driver appears.
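
As an illustration of Rule 2, the check below compares the worldCount of every materialised scope and fails the whole request on the first mismatch. It is a sketch under assumptions: QueryRuleException and HardRules are hypothetical names, and only the error code string comes from the table above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical exception type carrying the error code consumers branch on.
public sealed class QueryRuleException : Exception
{
    public string Code { get; }
    public QueryRuleException(string code, string message) : base(message) => Code = code;
}

public static class HardRules
{
    // Rule 2: every scope in a multi-scope query must report the same worldCount.
    // Returns the shared worldCount when the check passes.
    public static int RequireSameWorldCount(
        IReadOnlyList<(string ScopeId, int WorldCount)> scopes)
    {
        if (scopes.Count == 0)
            throw new ArgumentException("At least one scope is required.", nameof(scopes));

        var (firstId, expected) = scopes[0];
        foreach (var (scopeId, worldCount) in scopes)
        {
            if (worldCount != expected)
                throw new QueryRuleException(
                    "CONTEXT_SIZE_MISMATCH",
                    $"{scopeId} has {worldCount} worlds; {firstId} has {expected}.");
        }
        return expected;
    }
}
```

Throwing on the first offender matches the no-partial-success rule: one failing scope rejects the batch rather than producing results for the scopes that did align.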

6. The expression language

6.1 The wire format — typed infix tree

Consumers submit expressions as ExpressionInput, a discriminated union over four cases:

input ExpressionInput {
  outcome:  OutcomeRefInput   # leaf — references an outcome_id
  constant: Float             # leaf — numeric literal
  binary:   BinaryExprInput   # node with two operands
  unary:    UnaryExprInput    # node with one operand
}

input BinaryExprInput { left: ExpressionInput!, op: BinaryOp!, right: ExpressionInput! }
input UnaryExprInput  { op: UnaryOp!, operand: ExpressionInput! }

input OutcomeRefInput {
  context:       ContextType   # GAME | SEASON — optional, inherits
  scopeId:       ID            # gameId/seasonId — optional, inherits
  type:          OutcomeType!  # closed enum (catalog vocabulary)
  timePeriod:    TimePeriod!   # closed enum (GAME, HALF1..Q4, OT, SEASON)
  participantId: ID!
}

enum BinaryOp {
  ADD SUBTRACT MULTIPLY DIVIDE MODULO
  EQUAL NOT_EQUAL LESS_THAN GREATER_THAN
  LESS_OR_EQUAL GREATER_OR_EQUAL
  AND OR XOR
}

enum UnaryOp { NOT ABSOLUTE TO_INT }

Consumers never write postfix. The middle ring translates every ExpressionInput to postfix tokens via InfixToPostfix.Translate.

Scope inheritance: context and scopeId are optional under gameContext and under seasonContext-without-games; the leaf inherits from the enclosing query context. Under seasonContext-with-games (multi-scope), every leaf must specify its scope explicitly.
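
The translation from the infix tree to postfix is a post-order walk: operands first, operator last. The sketch below models the four ExpressionInput cases as C# records. The record names and PostfixSketch are hypothetical, and operator names are passed through unchanged because the mapping from BinaryOp values to postfix tokens such as GTE is not given here.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical in-memory mirror of the four ExpressionInput cases.
public abstract record Expr;
public sealed record OutcomeRef(string OutcomeId) : Expr;
public sealed record Constant(double Value) : Expr;
public sealed record Binary(Expr Left, string Op, Expr Right) : Expr;
public sealed record Unary(string Op, Expr Operand) : Expr;

public static class PostfixSketch
{
    // Returns the postfix token list for an expression tree.
    public static List<string> Translate(Expr expr)
    {
        var tokens = new List<string>();
        Walk(expr, tokens);
        return tokens;
    }

    // Post-order traversal: emit children before the operator that consumes them.
    private static void Walk(Expr expr, List<string> tokens)
    {
        switch (expr)
        {
            case OutcomeRef leaf:
                tokens.Add(leaf.OutcomeId);
                break;
            case Constant constant:
                // Invariant culture keeps the token text independent of server locale.
                tokens.Add(constant.Value.ToString(CultureInfo.InvariantCulture));
                break;
            case Binary binary:
                Walk(binary.Left, tokens);
                Walk(binary.Right, tokens);
                tokens.Add(binary.Op);
                break;
            case Unary unary:
                Walk(unary.Operand, tokens);
                tokens.Add(unary.Op);
                break;
            default:
                throw new ArgumentException($"Unsupported expression node: {expr.GetType().Name}");
        }
    }
}
```

Because the tree is already typed, no operator-precedence handling is needed; nesting in the input carries the grouping that parentheses would carry in text.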

6.2 The canonical form — postfix

Internally the engine works in postfix. The canonical postfix string is also the public-facing identity of an expression — it appears as id on every result, and its sha256 is the cache / dedup key.

Canonicalisation does three things:

  1. Materialises inherited scope on every leaf.
  2. Sorts commutative operands into a stable order.
  3. Expands derived outcomes: replaces each leaf that references a template with a non-null Definition by that definition's postfix, with slot substitution.

The result of canonicalisation is the postfix the evaluator runs.
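
A minimal sketch of the operand-ordering step and the derived hash follows. The specifics are assumptions: the set of operators treated as commutative, ordinal string comparison as the ordering rule, and the "|" token separator (taken from the id in the worked example) are illustrative choices, and CanonicalSketch is a hypothetical name. The hash uses the standard .NET SHA256.HashData and Convert.ToHexString calls.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

public static class CanonicalSketch
{
    // Assumption: operators whose operand order does not affect the result.
    private static readonly HashSet<string> Commutative =
        new() { "ADD", "MULTIPLY", "EQUAL", "NOT_EQUAL", "AND", "OR", "XOR" };

    // Joins two operand token runs and an operator into postfix, putting the
    // operands of a commutative operator into ordinal order first.
    public static string[] Combine(string[] left, string op, string[] right)
    {
        if (Commutative.Contains(op) &&
            string.CompareOrdinal(string.Join("|", left), string.Join("|", right)) > 0)
        {
            (left, right) = (right, left);
        }

        var tokens = new List<string>(left);
        tokens.AddRange(right);
        tokens.Add(op);
        return tokens.ToArray();
    }

    // Canonical id: postfix tokens joined with "|".
    public static string Id(string[] tokens) => string.Join("|", tokens);

    // sha256 of the id as 64 lowercase hex characters.
    public static string Hash(string id)
    {
        var bytes = SHA256.HashData(Encoding.UTF8.GetBytes(id));
        return Convert.ToHexString(bytes).ToLowerInvariant();
    }
}
```

With this ordering, A + B and B + A combine to the same token list and therefore the same id and hash, while a non-commutative operator such as SUBTRACT keeps the order the consumer wrote.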

6.3 Worked example

Consumer query:

{
  gameContext(gameId: "kc_buf_w1") {
    evaluate(expressions: [{
      name: "mahomes_2plus_tds",
      expression: {
        outcome: { type: TWO_PLUS_TDS, timePeriod: GAME, participantId: "mahomes_patrick_1" }
      }
    }]) {
      id
      display
      probability
      matchingWorlds
      totalWorlds
    }
  }
}

Middle-ring processing:

  1. Materialise. Load kc_buf_w1 via IOutcomeContextStore.GetByScopeIdAsync. Read its worldCount and worldSetRef.
  2. Resolve scope. Leaf has no scopeId; inherit kc_buf_w1 from the enclosing gameContext.
  3. Canonicalise. Leaf is TWO_PLUS_TDS_GAME_mahomes_patrick_1@kc_buf_w1. Template lookup finds TWO_PLUS_TDS_GAME_{participantId} with Definition = "TOTAL_TDS_GAME_{participantId} 2 GTE". Substitute → TOTAL_TDS_GAME_mahomes_patrick_1@kc_buf_w1 2 GTE.
  4. Type-check. Boolean root op (GTE) on Numeric operand against Float constant → valid.
  5. Translate to postfix tokens. Already in postfix from canonicalisation.
  6. Evaluate. Inner ring loads the TOTAL_TDS_GAME_mahomes_patrick_1 row's values, walks the postfix stack per world, produces a per-world boolean.
  7. Summarise. matchingWorlds = popcount, totalWorlds = worldCount, probability = matching / total.

Response:

{
  "id": "TOTAL_TDS_GAME_mahomes_patrick_1@kc_buf_w1|2|GTE",
  "display": "TOTAL_TDS_GAME_mahomes_patrick_1@kc_buf_w1 >= 2",
  "probability": 0.42,
  "matchingWorlds": 4200,
  "totalWorlds": 10000
}

The id and display are derived from the same canonicalised tree — the same expression always produces the same id, regardless of how the consumer wrote it.
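
The evaluate and summarise steps of the worked example can be sketched as a per-world stack walk. This is an illustration only: EvaluatorSketch is a hypothetical name, only a handful of binary operators are handled, booleans are represented as 1.0 and 0.0 in a double array (the engine's internal representation is still open in the spec), and rows are keyed by bare outcome id without the @scope suffix.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class EvaluatorSketch
{
    // Evaluates a postfix token list once per world; returns one value per world.
    public static double[] Evaluate(
        IReadOnlyList<string> tokens,
        IReadOnlyDictionary<string, int[]> rows,
        int worldCount)
    {
        var result = new double[worldCount];
        var stack = new Stack<double>();

        for (var world = 0; world < worldCount; world++)
        {
            stack.Clear();
            foreach (var token in tokens)
            {
                if (rows.TryGetValue(token, out var values))
                {
                    stack.Push(values[world]); // outcome leaf: this world's value
                }
                else if (double.TryParse(token, NumberStyles.Float,
                             CultureInfo.InvariantCulture, out var constant))
                {
                    stack.Push(constant); // numeric literal
                }
                else
                {
                    if (stack.Count < 2) throw new InvalidOperationException("MALFORMED_EXPRESSION");
                    var right = stack.Pop();
                    var left = stack.Pop();
                    stack.Push(Apply(token, left, right));
                }
            }

            // A well-formed expression leaves exactly one value on the stack.
            if (stack.Count != 1) throw new InvalidOperationException("MALFORMED_EXPRESSION");
            result[world] = stack.Pop();
        }
        return result;
    }

    private static double Apply(string op, double left, double right) => op switch
    {
        "ADD" => left + right,
        "SUBTRACT" => left - right,
        "GT" => left > right ? 1.0 : 0.0,
        "GTE" => left >= right ? 1.0 : 0.0,
        "AND" => left != 0.0 && right != 0.0 ? 1.0 : 0.0,
        _ => throw new InvalidOperationException($"Unsupported operator: {op}")
    };

    // matchingWorlds is the count of truthy worlds; probability is matching / total.
    public static (int MatchingWorlds, int TotalWorlds, double Probability) Summarise(double[] perWorld)
    {
        var matching = perWorld.Count(v => v != 0.0);
        var probability = perWorld.Length == 0 ? 0.0 : (double)matching / perWorld.Length;
        return (matching, perWorld.Length, probability);
    }
}
```

For a four-world row with values 2, 0, 3, 1, the tokens TOTAL_TDS_GAME_mahomes_patrick_1, 2, GTE match worlds 0 and 2, giving matchingWorlds = 2, totalWorlds = 4 and probability = 0.5. A production evaluator would more plausibly process whole columns at a time rather than re-walking the tokens per world.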

7. Response shape

Context-level fields (every context type carries these):

| Field | Type | Notes |
| --- | --- | --- |
| worldSetRef | String! | Server-resolved. Informational under D1. |
| worldCount | Int! | Server-resolved from materialised context. |
| contextVersion | Int! | Server-resolved. |

Per-expression fields (returned by evaluate):

| Field | Type | Notes |
| --- | --- | --- |
| id | String! | Canonical postfix string. Stable identity. |
| display | String! | Canonical infix string. Human-readable. |
| expressionHash | String! | sha256(id). 64-char lowercase hex. |
| resultType | ResultType! | BOOLEAN if root op classifies as boolean, else NUMERIC. |
| probability | Float | Populated when resultType = BOOLEAN. matchingWorlds / totalWorlds. |
| matchingWorlds | Int | Populated when resultType = BOOLEAN. |
| validWorlds | Int! | Worlds with a non-NaN value. |
| totalWorlds | Int! | worldCount. |
| mean, median, mode, modeFrequency, stdDev, min, max | Float! / Int! | Lazy: computed only when selected. |
| resolvedOutcomeIds | [OutcomeId!]! | The concrete outcome IDs the expression resolved to. |
| evaluationMs | Int! | Inner-ring wall time, milliseconds. |
| warnings | [EvalWarning!]! | Type-checker observations that did not block evaluation. |

Summary statistics (mean, median, etc.) are computed on demand. A query selecting only { probability matchingWorlds } skips sorting entirely.
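
One way to get this on-demand behaviour is to defer the sort behind Lazy<T>, so that fields such as matchingWorlds never trigger it. The sketch below is an assumption about shape, not the shipped resolver: LazyStats is a hypothetical name, per-world values are held as doubles with NaN marking "no value", and only median is shown among the sorted statistics.

```csharp
using System;
using System.Linq;

public sealed class LazyStats
{
    private readonly double[] _valid;
    private readonly Lazy<double[]> _sorted;

    public LazyStats(double[] perWorld)
    {
        // validWorlds counts worlds with a non-NaN value.
        _valid = perWorld.Where(v => !double.IsNaN(v)).ToArray();

        // The sorted copy is built the first time a sorted statistic is read.
        _sorted = new Lazy<double[]>(() =>
        {
            var copy = (double[])_valid.Clone();
            Array.Sort(copy);
            return copy;
        });
    }

    public int ValidWorlds => _valid.Length;

    // Needs no sorting: counts truthy values among valid worlds.
    public int MatchingWorlds => _valid.Count(v => v != 0.0);

    // Exposed so the sketch can show whether the sort has happened.
    public bool SortPerformed => _sorted.IsValueCreated;

    public double Median
    {
        get
        {
            var sorted = _sorted.Value;
            if (sorted.Length == 0) return double.NaN;
            var mid = sorted.Length / 2;
            return sorted.Length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
        }
    }
}
```

Reading ValidWorlds or MatchingWorlds leaves SortPerformed false; the first read of Median sorts once, and later reads reuse the sorted copy.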

Full response-shape reference: spec §10.4.

8. Where the boundary sits

The integration line is IOutcomeContextStore. Everything above it is query layer; everything below it is storage.

Storage owns:

  - The ClickHouse cluster, the schemas, the writers, the merge / partitioning behaviour.
  - The IOutcomeContextStore implementation.
  - The outcome template catalog (data, persistence, the build step from infix authoring → postfix templates).
  - The pre-simulation catalog (data, persistence, lifecycle per run).
  - The OutcomeIdParser shared utility (lives in LBS.OutcomeContext.Contracts, used by both sides).
  - Schema migrations and operational concerns (caching, observability when added).

Query layer owns:

  - The GraphQL surface (HotChocolate).
  - ContextRepository: adapter from IOutcomeContextStore to the IOutcomeContext evaluator interface.
  - Enforcement of the hard rules (five originally; four after D1).
  - ExpressionCanonical: canonicalisation, derived expansion, postfix translation.
  - TypeChecker: type rules and severities.
  - PostfixEvaluator: the inner ring.
  - The error taxonomy and response shape.

Shared:

  - The LBS.OutcomeContext.Contracts types: OutcomeRow, OutcomeDefinition, GameOutcomeContext, SeasonOutcomeContext. Both sides reference one assembly.
  - Outcome ID format and the OutcomeIdParser that enforces it.
  - The TimePeriod vocabulary.
  - The catalog vocabulary: OutcomeType values are storage-side data; the GraphQL closed enum tracks them. Adding a new outcome type to storage requires a coordinated client update (this is intentional; see spec §10.1).

9. Code locations

| Concern | Location |
| --- | --- |
| Storage contracts (sport-agnostic) | lbs.foundry/src/OutcomeContext/LBS.OutcomeContext.Contracts/ |
| Query engine (sport-agnostic) | lbs.foundry/src/OutcomeContext/LBS.OutcomeContext.Query/ |
| Query engine tests | lbs.foundry/src/OutcomeContext/LBS.OutcomeContext.Query.Tests/ |
| American Football catalogue + accumulators (sport-coupled) | lbs.foundry/src/Models/AmericanFootball/LBS.Model.AmericanFootball.Accumulation/AmericanFootball/ |
| ClickHouse storage (schemas + read/write seams) | lbs.foundry/src/OutcomeContext/LBS.OutcomeContext.Storage/ClickHouse/ (CREATE TABLE DDL is inlined in ClickHouseOutcomeContextWriter.cs, with no separate schemas class; ClickHouseOutcomeContextStore.cs is the read side) |
| Game accumulator | Accumulation/AmericanFootball/AmericanFootballOutcomeAccumulator.cs (composes 7 per-play accumulators + TdOrdinalDeriver) |
| Season accumulator | Accumulation/AmericanFootball/AmericanFootballSeasonAccumulator.cs (refactored 2026-04 to fixed-array storage; ~1.3 GB at 100K) |
| Storage experiment status | docs/outcome-context/evaluations/storage-experiment/status-summary.md |
| Schema review history | docs/clickhouse-schema-review.md |

In-memory fixtures were built in ContextRepository.BuildFixtures during the storage experiment; the shipped query layer reads from the storage-backed IOutcomeContextStore.

11. The convergence work

The cleanups required on each side are listed in the gap analysis. At a glance:

Storage:

  - Add Definition (postfix template string) to the catalog; persist the catalog.
  - Migrate read-side derivations (ANYTIME_TD, COMPLETION_PCT, etc.) from C# code into declarative templates. Inventory which derivations move cleanly and which are escape-hatch sim-produced booleans (§11.8.6).
  - Split IOutcomeCatalogue into a roster-agnostic template catalog and a roster-driven pre-simulation catalog producer.
  - Design and persist the pre-simulation catalog.
  - Extract IOutcomeContextStore as part of the broader storage-layer rework.
  - Move OutcomeIdParser (or equivalent) to Accumulation for shared use.

Query layer:

  - Drop Rule 3 enforcement; retire CONTEXT_VERSION_MISMATCH; remove the misalignment fixtures.
  - Change OutcomeDefinition.Definition from ExpressionInput? to a postfix-template string?.
  - Update ExpressionCanonical.ExpandDerived to consume postfix templates with slot substitution.
  - Replace static OutcomeCatalog with a templated catalog backed by storage's template registry, plus pre-sim-catalog lookups for concrete materialisation.
  - Replace in-memory fixture builder with IOutcomeContextStore calls.
  - Update CLAUDE.md, spec sections §4 Rule 3, §9.1, §9.2, §9.3, §10.3, §11.1, §11.8.2.

The Phase 3 code-convergence plan sequences these. Natural order: D1 cleanup (smallest, removes code) → D2 catalog/Definition changes → D3 catalog split + pre-sim catalog → D4 read interface during the storage-layer rework.

12. Open items

Three sub-designs are deliberately deferred until shapes firm up during integration. Each is summarised in the gap analysis and gets its own short doc when work begins.

| Sub-design | Owner | Open questions |
| --- | --- | --- |
| D2 follow-up: Definition build pipeline | Storage + query | Authoring source-of-truth (YAML? code-gen? hand-written postfix?), build-time validation, cross-template references and cycle detection, catalog versioning for cache invalidation. |
| D3 follow-up: Pre-simulation catalog | Storage | Canonical name, persistence target, identification scheme, slot/role grammar for templates, lifecycle (mutable mid-run? frozen at sim start?). |
| D4 follow-up: IOutcomeContextStore method shape | Storage | Run filtering semantics when version omitted, bulk-read shape, projection support, whether one store or two. |

Spec sections still [DRAFT] or [TBD] after this convergence:

  - §11.2 Entity resolution
  - §11.3 contextVersion pinning
  - §11.5 Expression persistence and canonical serialisation
  - §11.6 Outcome value semantics and engine internal representation
  - §11.7 Batched evaluate (Wave 2)
  - §12 Operational concerns (caching, observability, schema versioning), Wave 3

13. Pointers