Environment: Azure Container Apps Job (D32-benchmark, 32 vCPU / 64 GiB), westus3
ClickHouse Cloud Production Scale (2 replicas × 4 CPU × ~8 GiB)
Connection: HTTPS, Compression=true, set_async_insert=1, set_wait_for_async_insert=0
PBP sink: clickhouse (direct bulk-copy, 4-way parallel per chunk)
Job execution: oc-exp-1k-p32-km50pm8

Backend: clickhouse, Scale: 10000 worlds, Environment: local
Streaming mode: totalWorlds=10000, chunkSize=500 [parallel=32]
PBP sink: clickhouse at ./pbp-parquet

Streaming Run Report
--------------------
Total wall time:         1119.68 s
Simulation time:          136.47 s
OC write time:             51.40 s
PBP write time:           818.46 s
Merge time:               106.20 s
Peak working set:        41830.7 MB
Chunks completed:             20
OC rows (merged):        244,800
PBP rows (written):   419,758,294

Notes:
- Validated the PBP-parallel path at 10K scale. 4-way concurrent PBP
  bulk-copies per chunk.
- PBP throughput: 419.76M rows in 818.46 s = 513K rows/sec, about 2.9x
  the per-row throughput of the 1K run (async_insert amortises its
  overhead across larger bulks).
- First attempt at 16 CPU / 32 GiB OOM'd at the PBP parallel-slice
  serialisation step (peak client memory ~24 GB during parallel PBP
  writes was too close to the 32 GiB limit). Raised container to 32 CPU
  / 64 GiB; peak working set was 41.8 GB (41830.7 MB), under two-thirds
  of the new limit.
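
The derived figures in the notes can be re-checked from the report with a few lines of Python. This is only a sketch: the variable names are illustrative, the phase timings are assumed to be sequential (so their sum can be compared against wall time), and "MB" in the report is assumed to mean MiB.

```python
# Figures copied from the Streaming Run Report above.
wall_s = 1119.68
phases_s = {
    "simulation": 136.47,
    "oc_write": 51.40,
    "pbp_write": 818.46,
    "merge": 106.20,
}
pbp_rows = 419_758_294
peak_mb = 41830.7   # peak working set as reported
limit_gib = 64      # container memory limit

# PBP write throughput in rows per second.
pbp_rows_per_s = pbp_rows / phases_s["pbp_write"]

# Share of wall time per phase (assumes phases do not overlap).
phase_share = {name: secs / wall_s for name, secs in phases_s.items()}

# Wall time not attributed to any reported phase.
unaccounted_s = wall_s - sum(phases_s.values())

# Fraction of the container limit used at peak (assumes MB == MiB).
peak_fraction = peak_mb / (limit_gib * 1024)

print(f"PBP throughput: {pbp_rows_per_s / 1000:.0f}K rows/sec")
for name, share in phase_share.items():
    print(f"{name:>10}: {share:.1%} of wall time")
print(f"unaccounted: {unaccounted_s:.2f} s")
print(f"peak memory: {peak_fraction:.1%} of limit")
```

Under those assumptions the PBP write phase accounts for roughly 73% of wall time, which makes it the dominant cost at this scale.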
