Environment: Azure Container Apps Job (D32-benchmark, 32 vCPU / 64 GiB), westus3
             ClickHouse Cloud Production tier (3 replicas × 16 vCPU × 64 GiB = 48 vCPU / 192 GiB)
Connection: HTTPS, Compression=true, set_async_insert=1, set_wait_for_async_insert=0
PBP sink: none (--no-pbp)
Merge parallelism cap: 16 (was 4 on smaller tier; raised in commit ba0bb337)
Job execution: oc-exp-1k-p32-7c1jm5n

Backend: clickhouse, Scale: 100000 worlds, Environment: local
Streaming mode: totalWorlds=100000, chunkSize=500 [PBP DISABLED] [parallel=32]
PBP sink: local at ./pbp-parquet

Streaming Run Report
--------------------
Total wall time:     1886.50 s
Simulation time:     1412.06 s
OC write time:       327.67 s
PBP write time:      0.00 s
Merge time:          136.50 s
Peak working set:    18962.6 MB
Chunks completed:    200
OC rows (merged):    244,800
PBP rows (written):  0

Notes:
- 100K OC-only rebaseline on Production 3×16 (48 vCPU aggregate).
- Vs prior Production Scale (2×4) baseline of 2,991s: total dropped 37% (2,991s → 1,887s), with nearly all of the improvement coming from merge: 1,210s → 137s, an 8.9× speedup.
- Merge cap raised from 4 to 16 to exploit the new per-replica RAM budget (was constrained to 4 on 14.4 GiB/replica; 64 GiB/replica comfortably handles 16 × ~680 MB concurrent arrayFlatten working sets).
- Sim phase unchanged (client-bound, prototype); it is now 75% of wall time.
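
The percentages and ratios quoted in the notes can be re-derived from the reported timings. The following is a minimal sketch of that arithmetic, not part of the benchmark harness; all variable names are illustrative, and the inputs are copied from the report and notes above.

```python
# Re-derive the figures quoted in the notes from the reported timings.
# All names here are illustrative; the numbers are copied from the report.

phases_s = {
    "simulation": 1412.06,
    "oc_write": 327.67,
    "pbp_write": 0.00,
    "merge": 136.50,
}
total_wall_s = 1886.50

# Prior baseline figures, as stated in the notes.
prior_total_s = 2991.0
prior_merge_s = 1210.0


def share(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Share of wall time per phase, and wall time not attributed to any phase.
phase_share_pct = {name: share(s, total_wall_s) for name, s in phases_s.items()}
unattributed_s = total_wall_s - sum(phases_s.values())

# Improvement versus the prior baseline.
total_saved_s = prior_total_s - total_wall_s
merge_saved_s = prior_merge_s - phases_s["merge"]
total_drop_pct = share(total_saved_s, prior_total_s)
merge_speedup = prior_merge_s / phases_s["merge"]
merge_share_of_saving_pct = share(merge_saved_s, total_saved_s)

# Chunk count implied by the streaming settings.
expected_chunks = 100_000 // 500

# Rough concurrent merge footprint: 16 merges at ~680 MB each (approximate
# per-merge figure taken from the notes).
merge_budget_mb = 16 * 680

for name, pct in phase_share_pct.items():
    print(f"{name:>10}: {pct:5.1f}% of wall time")
print(f"unattributed: {unattributed_s:.2f} s")
print(f"total drop: {total_drop_pct:.1f}%  merge speedup: {merge_speedup:.1f}x")
print(f"merge share of time saved: {merge_share_of_saving_pct:.1f}%")
print(f"expected chunks: {expected_chunks}  merge footprint: ~{merge_budget_mb} MB")
```

The three timed phases sum to slightly less than the total wall time; the report does not say what the remainder covers, so the sketch only reports it as unattributed.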