Environment: Azure Container Apps Job (D32-benchmark, 32 vCPU / 64 GiB), westus3 ClickHouse Cloud Production Scale (2 replicas × 4 CPU × ~8 GiB) Connection: HTTPS, Compression=true, set_async_insert=1, set_wait_for_async_insert=0 PBP sink: clickhouse (direct bulk-copy, 4-way parallel per chunk) Job execution: oc-exp-1k-p32-p0xsb6u Replica timeout: 14,400s (4 hours) Result: DeadlineExceeded — job terminated at 4h without completing Backend: clickhouse, Scale: 100000 worlds, Environment: local Streaming mode: totalWorlds=100000, chunkSize=500 [parallel=32] PBP sink: clickhouse at ./pbp-parquet (no Streaming Run Report — process terminated mid-run) Analysis of the failure: - Started 2026-04-19T18:20:12 UTC, terminated 2026-04-19T22:20:18 UTC - Linear projection from 10K (818s PBP write, 513K rows/sec throughput) gave 100K PBP write = ~8,200s = 2.3h, total ~2.5h. Should have fit. - Actual: still in the PBP write phase at 4h. That's super-linear, at least 1.75x worse than the 10K-based projection. - Most likely cause: the play_by_play table is now PARTITION BY season_id (single partition per season). At 100K scale a single partition has ~4.2B rows. ClickHouse MergeTree creates parts and eventually merges them in the background; at this volume the background-merge pressure on the single partition may cap foreground INSERT throughput in a way that doesn't show at 10K (where the partition is only ~420M rows). - Secondary factor: 4 client workers × 32 parallel_workers = potentially overwhelming the 2-replica Cloud tier's insert throughput ceiling for sustained hours. What to try next (any one should unblock 100K-with-PBP): 1. Bigger CH Cloud tier. Production Scale has 2 × 4 CPU replicas; bumping to 4 × 16 CPU would 8x the insert-side capacity. 2. Partition PBP by (season_id, intDiv(world_id, 10000)). Still <100 partitions per run (a hard CH Cloud setting), but spreads data across 10 partitions so background merges happen in parallel. 3. Drop PBP from the critical-path 100K pipeline. Ship OC-only at 49.85 min as the primary artifact; run PBP as a secondary bulk-insert pipeline when needed. Highest validated with-PBP scale: 10K (1,120s / 18.7 min). See the 10K run file for details.