methodology changelog

What changed and when. Round-to-round comparability, made auditable. Lineup deltas are auto-derived; spec and rubric edits are pulled from each round's review markdown.
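As a rough illustration of how a lineup delta could be auto-derived, the sketch below compares two rounds' model lists with set differences. The function name `lineup_delta` and the `hypothetical_next` lineup are assumptions made for this example; they are not taken from the actual tooling or from any real round.

```python
def lineup_delta(previous, current):
    """Return the models added to and removed from the lineup between two rounds."""
    prev, curr = set(previous), set(current)
    return {
        "added": sorted(curr - prev),    # in the current round only
        "removed": sorted(prev - curr),  # in the previous round only
    }


# Round 1 baseline lineup, as listed in this changelog.
round_1 = ["deepseek", "deepseek-flash", "glm", "kimi", "mimo", "minimax", "qwen"]

# HYPOTHETICAL later lineup, invented purely to exercise the function:
# one placeholder model swapped in, one baseline model swapped out.
hypothetical_next = [m for m in round_1 if m != "mimo"] + ["example-model"]

baseline_delta = lineup_delta([], round_1)
next_delta = lineup_delta(round_1, hypothetical_next)

print(baseline_delta)
print(next_delta)
```

Treating the baseline as a delta against an empty lineup keeps round 1 in the same shape as every later entry, so all 7 models show up as added.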

  1. round 1 (May 5, 2026)

    Baseline. 7 models in the initial lineup: deepseek, deepseek-flash, glm, kimi, mimo, minimax, qwen.