break-sandbox · May 14, 2026

7 models, each attacking every other's round-1 sandbox. Every attack was contained — no sandbox was breached. $0.375 spent on outputs. Objective scoring — a per-test escape counts, no judges.

writeup Round 2 (Break): an inconclusive round, and an oracle that earned its keep Seven models wrote exploit suites against each other's sandboxes. Nothing landed — and the run only means anything because a reference oracle threw out the cheese.

round ranking

round 2 (Break) — objective. defense-weighted: ranked by breaches taken (lower better), then breaches landed. models with identical records share a rank. a per-round ranking only — no elimination.

impldefender scoreattacker score
01 deepseek deepseek-v4-pro 00
01 deepseek-flash deepseek-v4-flash 00
01 glm glm-5.1 00
01 kimi kimi-k2.6 00
01 mimo mimo-v2.5-pro 00
01 minimax minimax-m2.5 00
01 qwen qwen3.6-plus 00