Environment: Azure Container Apps Job (D32-benchmark, 32 vCPU / 64 GiB), westus3
ClickHouse Cloud Production tier (3 replicas × 16 vCPU × 64 GiB)
Connection: HTTPS, Compression=true, set_async_insert=1, set_wait_for_async_insert=0
PBP sink: clickhouse with parallel writes
Season context: ENABLED (refactored accumulator)
Job execution: oc-exp-1k-p32-84iykp5
Image: commit 34155f2d (schema changes from the CH docs review)

Schema changes applied in this run:
  1. play_by_play: PARTITION BY (season_id, intDiv(world_id, 10000))
     (was: PARTITION BY season_id — 1 partition per season)
  2. play_by_play: CODEC(Delta, ZSTD) on monotonic integer columns
     (world_id, play_sequence, clock_seconds, drive_*_clock_seconds)
  3. season_outcome_context: added PARTITION BY season_id
  4. experiment_results: added PARTITION BY toYYYYMM(run_timestamp)

RESULT: FAILED at 4h replica deadline. Only ~2,000 of 100,000 worlds
processed (2% complete). Baseline without these changes was 2h 58m
complete. The "optimisations" caused a ~68× slowdown in per-world
throughput.

Diagnostic evidence from CH Cloud system.query_log (last 5h):
- Insert queries: 387 (baseline ~30,000 for a complete run)
- Select queries: 8,737 (ratio 22:1 select-to-insert)
- Individual insert speed when they did run: healthy (21s for 5.2M
  rows = 250K rows/sec, matches historical)
- No exceptions, no MEMORY_LIMIT errors, no server-side failures

The 22:1 select-to-insert ratio is the smoking gun. ClickHouseBulkCopy
calls InitAsync before every insert to fetch column metadata (the
'SELECT ... WHERE ?=?' pattern). With the new partition expression on
play_by_play (intDiv(world_id, 10000)), each of those InitAsync calls
became much slower — possibly because the driver has to parse and
understand the computed partition expression for each call. The
client then appeared stuck between inserts, making almost no progress.

Action: reverted the schema changes in commit 150ec156. Primary
baseline remains the 100K full-pipeline+season run at 2h 58m with
PARTITION BY season_id on play_by_play.

Lessons:
- CH best-practice docs describe table-level design in isolation.
  They don't account for client-library behaviour (BulkCopy's per-
  insert metadata fetch) or our specific orchestration pattern.
- Schema changes from 'review by spec' need validation runs BEFORE
  being committed as improvements.
- Empirical > theoretical: the 2h 58m baseline with 'wrong-by-the-
  book' partitioning beats every proposed improvement we've measured.
