Environment:
  Azure Container Apps Job (D32-benchmark, 32 vCPU / 64 GiB), westus3
  ClickHouse Cloud Production Scale (2 replicas × 4 CPU × ~8 GiB)
  Connection: HTTPS, Compression=true, set_async_insert=1, set_wait_for_async_insert=0
  PBP sink: clickhouse (direct bulk-copy, 4-way parallel per chunk)

Job execution: oc-exp-1k-p32-km50pm8

Backend: clickhouse, Scale: 10000 worlds, Environment: local
Streaming mode: totalWorlds=10000, chunkSize=500 [parallel=32]
PBP sink: clickhouse at ./pbp-parquet

Streaming Run Report
--------------------
Total wall time:      1119.68 s
Simulation time:      136.47 s
OC write time:        51.40 s
PBP write time:       818.46 s
Merge time:           106.20 s
Peak working set:     41830.7 MB
Chunks completed:     20
OC rows (merged):     244,800
PBP rows (written):   419,758,294

Notes:
- Validated the PBP-parallel path at 10K scale. 4-way concurrent PBP
  bulk-copies per chunk.
- PBP throughput: 419.76M rows in 818.46s = 513K rows/sec, 2.9x better
  per-row than 1K (async_insert amortises its overhead across larger bulks).
- First attempt at 16 CPU / 32 GiB OOM'd at the PBP parallel-slice
  serialisation step (peak client memory ~24 GB during parallel PBP writes
  was too close to the 32 GiB limit). Raised container to 32 CPU / 64 GiB;
  peak was 41.8 GB, which fits comfortably.
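
The throughput and chunk figures quoted in the notes can be re-derived from the report with a few lines of arithmetic. The Python sketch below is illustrative only: the constants are copied from the report above, while the helper names (`rows_per_second`, `phase_shares`, `chunk_count`) are hypothetical and not part of the benchmark tooling.

```python
from typing import Dict

# Figures copied from the Streaming Run Report.
WALL_S = 1119.68
PHASES_S = {
    "simulation": 136.47,
    "oc_write": 51.40,
    "pbp_write": 818.46,
    "merge": 106.20,
}
PBP_ROWS = 419_758_294
TOTAL_WORLDS = 10_000
CHUNK_SIZE = 500


def rows_per_second(rows: int, seconds: float) -> float:
    """Average write throughput over a phase."""
    return rows / seconds


def phase_shares(phases: Dict[str, float], wall: float) -> Dict[str, float]:
    """Fraction of total wall time spent in each reported phase."""
    return {name: secs / wall for name, secs in phases.items()}


def chunk_count(total_worlds: int, chunk_size: int) -> int:
    """Number of chunks needed to cover all worlds (ceiling division)."""
    return -(-total_worlds // chunk_size)


if __name__ == "__main__":
    pbp_rate = rows_per_second(PBP_ROWS, PHASES_S["pbp_write"])
    print(f"PBP throughput: {pbp_rate / 1000:.0f}K rows/sec")  # 513K rows/sec

    for name, share in phase_shares(PHASES_S, WALL_S).items():
        print(f"{name:>10}: {share:.1%} of wall time")

    # Wall time not attributed to any of the four reported phases.
    unaccounted = WALL_S - sum(PHASES_S.values())
    print(f"Unattributed wall time: {unaccounted:.2f} s")

    print(f"Chunks: {chunk_count(TOTAL_WORLDS, CHUNK_SIZE)}")  # 20
```

On these figures the PBP write phase accounts for roughly 73% of total wall time, and about 7 s of wall time falls outside the four reported phases.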