atomic-write · May 8, 2026
mimo takes the round with 29.0/30 — spec 10.0, quality 19.0. 7 models, $0.259 spent on outputs. Hidden tests: all passed.
total = peer-judged spec /10 + quality /20. hidden tests gate the verdict.
| impl | total | spec | qual | build | tests | verdict |
|---|---|---|---|---|---|---|
| 01 mimo mimo-v2.5-pro | 29.0 | 10.0 | 19.0 | pass | 12/12 | ship-with-cleanup |
| 02 kimi kimi-k2.6 | 24.0 | 10.0 | 14.0 | pass | 12/12 | ship-with-cleanup |
| 03 minimax minimax-m2.5 | 24.0 | 9.0 | 15.0 | pass | 12/12 | ship-with-cleanup |
| 04 qwen qwen3.6-plus | 24.0 | 9.0 | 15.0 | pass | 12/12 | ship-with-cleanup |
| 05 deepseek deepseek-v4-pro | 23.0 | 9.0 | 14.0 | pass | 12/12 | ship-with-cleanup |
| 06 deepseek-flash deepseek-v4-flash | 21.0 | 8.0 | 13.0 | pass | 12/12 | ship-with-cleanup |
| 07 glm glm-5.1 | 21.0 | 9.0 | 12.0 | pass | 12/12 | ship-with-cleanup |
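
The totals above appear to be the median of the peer spec scores plus the median of the peer quality scores, with the self-judged row left out. The sketch below reproduces mimo's 29.0 from its per-judge rows further down. The helper name `peer_total` and the choice of median are assumptions inferred from the numbers in this report, not a documented formula.

```python
from statistics import median

# (judge, spec, quality) rows for the mimo build, copied from the per-judge
# table; the self row (mimo) and the failed kimi judge are excluded.
peer_scores = [
    ("deepseek", 10.0, 19.0),
    ("deepseek-flash", 10.0, 19.0),
    ("glm", 10.0, 16.0),
    ("minimax", 10.0, 19.0),
    ("qwen", 10.0, 18.0),
]


def peer_total(rows):
    """Return (spec median, quality median, total) over peer rows."""
    spec = median(s for _, s, _ in rows)
    qual = median(q for _, _, q in rows)
    return spec, qual, spec + qual


print(peer_total(peer_scores))  # (10.0, 19.0, 29.0)
```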

| impl | | wall | cost | tokens | | |
|---|---|---|---|---|---|---|
| deepseek deepseek-v4-pro | 1 | 3m29s | $0.051 | 98.4k | — | ✓ |
| deepseek-flash deepseek-v4-flash | 1 | 2m5s | $0.0046 | 102.7k | — | ✓ |
| glm glm-5.1 | 1 | 1m39s | $0.058 | 72.7k | — | ✓ |
| kimi kimi-k2.6 | 1 | 5m46s | $0.080 | 115.7k | — | ✓ |
| mimo mimo-v2.5-pro | 1 | 49s | $0.034 | 80.7k | — | ✓ |
| minimax minimax-m2.5 | 1 | 20s | $0.0099 | 99.8k | — | ✓ |
| qwen qwen3.6-plus | 1 | 53s | $0.022 | 115.7k | — | ✓ |

Δ = self − peer median. positive Δ = overrated self. negative Δ = humble.
| impl | self spec | peer med | Δ spec | self qual | peer med | Δ qual |
|---|---|---|---|---|---|---|
| deepseek deepseek-v4-pro | 10.0 | 9.0 | +1.0 | 14.0 | 14.0 | 0.0 |
| deepseek-flash deepseek-v4-flash | 9.0 | 8.0 | +1.0 | 15.0 | 13.0 | +2.0 |
| glm glm-5.1 | 10.0 | 9.0 | +1.0 | 12.0 | 12.0 | 0.0 |
| kimi kimi-k2.6 | — | 10.0 | — | — | 14.0 | — |
| mimo mimo-v2.5-pro | 9.0 | 10.0 | -1.0 | 17.0 | 19.0 | -2.0 |
| minimax minimax-m2.5 | 10.0 | 9.0 | +1.0 | 16.0 | 15.0 | +1.0 |
| qwen qwen3.6-plus | 9.0 | 9.0 | 0.0 | 16.0 | 15.0 | +1.0 |
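
As a worked example of the Δ column, the sketch below recomputes the deepseek-flash row from its per-judge scores. The function name `self_vs_peers` is hypothetical, and the peer lists are copied from the deepseek-flash judge table later in this report, with the failed kimi judge omitted.

```python
from statistics import median


def self_vs_peers(self_score, peer_scores):
    """Return (peer median, delta), where delta = self - peer median."""
    peer_med = median(peer_scores)
    return peer_med, self_score - peer_med


# deepseek-flash: self spec 9.0, self quality 15.0
flash_spec = self_vs_peers(9.0, [8.0, 10.0, 7.0, 10.0, 8.0])
flash_qual = self_vs_peers(15.0, [13.0, 12.0, 15.0, 16.0, 13.0])

print(flash_spec)  # (8.0, 1.0)
print(flash_qual)  # (13.0, 2.0)
```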

| judge | 1st | 2nd | 3rd |
|---|---|---|---|
| deepseek deepseek-v4-pro | deepseek (10) | kimi (10) | mimo (10) |
| deepseek-flash deepseek-v4-flash | glm (10) | kimi (10) | mimo (10) |
| glm glm-5.1 | deepseek-flash (10) | glm (10) | mimo (10) |
| kimi kimi-k2.6 | — | — | — |
| mimo mimo-v2.5-pro | glm (9) | kimi (9) | mimo (9) |
| minimax minimax-m2.5 | deepseek (10) | deepseek-flash (10) | glm (10) |
| qwen qwen3.6-plus | deepseek (10) | kimi (10) | mimo (10) |

| impl | min | max | range | stdev | judges |
|---|---|---|---|---|---|
| deepseek deepseek-v4-pro | 8.0 | 10.0 | 2.0 | 0.82 | deepseek, deepseek-flash, glm, mimo, minimax, qwen |
| deepseek-flash deepseek-v4-flash | 7.0 | 10.0 | 3.0 | 1.21 | deepseek, deepseek-flash, glm, mimo, minimax, qwen |
| glm glm-5.1 | 9.0 | 10.0 | 1.0 | 0.55 | deepseek, deepseek-flash, glm, mimo, minimax, qwen |
| kimi kimi-k2.6 | 9.0 | 10.0 | 1.0 | 0.52 | deepseek, deepseek-flash, glm, mimo, minimax, qwen |
| mimo mimo-v2.5-pro | 9.0 | 10.0 | 1.0 | 0.41 | deepseek, deepseek-flash, glm, mimo, minimax, qwen |
| minimax minimax-m2.5 | 8.0 | 10.0 | 2.0 | 0.82 | deepseek, deepseek-flash, glm, mimo, minimax, qwen |
| qwen qwen3.6-plus | 7.0 | 9.0 | 2.0 | 0.82 | deepseek, deepseek-flash, glm, mimo, minimax, qwen |
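
The spread figures above match the spec scores when the self-judged score is included and the sample standard deviation (n − 1 denominator) is used. That reading is an inference from the numbers, not something the report states. A minimal sketch for the deepseek-flash row, with a hypothetical helper `spread`:

```python
from statistics import stdev


def spread(scores):
    """Summarize judge disagreement on one build."""
    lo, hi = min(scores), max(scores)
    return {
        "min": lo,
        "max": hi,
        "range": hi - lo,
        # statistics.stdev is the sample standard deviation (n - 1).
        "stdev": round(stdev(scores), 2),
    }


# Spec scores for deepseek-flash in judge order:
# deepseek, deepseek-flash (self), glm, mimo, minimax, qwen
flash_spec = [8.0, 9.0, 10.0, 7.0, 10.0, 8.0]
print(spread(flash_spec))
# {'min': 7.0, 'max': 10.0, 'range': 3.0, 'stdev': 1.21}
```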

builds/deepseek/rounds/atomic-write-2026-05-08

| judge | tier | build | spec | qual | verdict | note |
|---|---|---|---|---|---|---|
| deepseek | self | pass | 10.0 | 14.0 | ship-with-cleanup | Lean and correct: minimal code that hits every spec requirement, but bare wit... |
| deepseek-flash | peer | pass | 9.0 | 15.0 | ship-with-cleanup | Minimal and mostly correct implementation, concise but lacking comments and w... |
| glm | peer | pass | 9.0 | 12.0 | ship-with-cleanup | Shortest and most direct implementation; passes all hard-fails but uses full ... |
| kimi | peer | fail | — | — | — | — |
| mimo | peer | pass | 8.0 | 14.0 | ship-with-cleanup | Most concise implementation with clean structure — except Exception instead o... |
| minimax | peer | pass | 10.0 | 14.0 | ship-with-cleanup | Minimal, correct implementation. No comments. May be missing explicit FileNot... |
| qwen | peer | pass | 10.0 | 14.0 | ship-with-cleanup | Fully spec-compliant with explicit error checks and correct fsync ordering, b... |

tests: test_bytes_basic_write, test_cli_stdin_to_path, test_concurrent_writers_no_corruption, test_missing_parent_raises_filenotfound, test_mode_applied_when_set, test_mode_preserved_when_none_and_target_exists, test_no_tmp_residue_on_open_failure, test_no_tmp_residue_on_success, test_path_is_directory_raises, test_replaces_existing, test_symlink_writes_to_target, test_text_basic_write

builds/deepseek-flash/rounds/atomic-write-2026-05-08

| judge | tier | build | spec | qual | verdict | note |
|---|---|---|---|---|---|---|
| deepseek | peer | pass | 8.0 | 13.0 | ship-with-cleanup | Functional but bare: lacks explicit exception guards for SPEC-mandated error ... |
| deepseek-flash | self | pass | 9.0 | 15.0 | ship-with-cleanup | Minimal and correct implementation, hurt only by a missing IsADirectoryError ... |
| glm | peer | pass | 10.0 | 12.0 | ship-with-cleanup | Correct spec-compliant implementation with proper mode handling before replac... |
| kimi | peer | fail | — | — | — | — |
| mimo | peer | pass | 7.0 | 15.0 | ship-with-cleanup | Minimal, correct implementation with proper fsync and cleanup — missing IsADi... |
| minimax | peer | pass | 10.0 | 16.0 | ship-with-cleanup | Robust atomic write implementation with proper fsync, temp-in-same-dir, and m... |
| qwen | peer | pass | 8.0 | 13.0 | ship-with-cleanup | Solid atomic write with correct fsync ordering and thorough cleanup, but lack... |

tests: test_bytes_basic_write, test_cli_stdin_to_path, test_concurrent_writers_no_corruption, test_missing_parent_raises_filenotfound, test_mode_applied_when_set, test_mode_preserved_when_none_and_target_exists, test_no_tmp_residue_on_open_failure, test_no_tmp_residue_on_success, test_path_is_directory_raises, test_replaces_existing, test_symlink_writes_to_target, test_text_basic_write

builds/glm/rounds/atomic-write-2026-05-08

| judge | tier | build | spec | qual | verdict | note |
|---|---|---|---|---|---|---|
| deepseek | peer | pass | 9.0 | 12.0 | ship-with-cleanup | Functional but muddled: encoding-sentinel dispatch is fragile, nested excepti... |
| deepseek-flash | peer | pass | 10.0 | 12.0 | ship-with-cleanup | Functionally correct but has a latent double-close bug in the error handling ... |
| glm | self | pass | 10.0 | 12.0 | ship-with-cleanup | Solid spec-compliant implementation with correct mode handling and full durab... |
| kimi | peer | fail | — | — | — | — |
| mimo | peer | pass | 9.0 | 14.0 | ship-with-cleanup | Elegant encoding-parameter pattern avoids code duplication — solid error hand... |
| minimax | peer | pass | 10.0 | 14.0 | ship-with-cleanup | Functional implementation with correct fsync and mode handling. No explanator... |
| qwen | peer | pass | 9.0 | 11.0 | ship-with-cleanup | Spec-compliant except for missing FileNotFoundError, but the encoding-as-disc... |

tests: test_bytes_basic_write, test_cli_stdin_to_path, test_concurrent_writers_no_corruption, test_missing_parent_raises_filenotfound, test_mode_applied_when_set, test_mode_preserved_when_none_and_target_exists, test_no_tmp_residue_on_open_failure, test_no_tmp_residue_on_success, test_path_is_directory_raises, test_replaces_existing, test_symlink_writes_to_target, test_text_basic_write

builds/kimi/rounds/atomic-write-2026-05-08

| judge | tier | build | spec | qual | verdict | note |
|---|---|---|---|---|---|---|
| deepseek | peer | pass | 10.0 | 13.0 | ship-with-cleanup | Best exception specificity with proper errno codes and strerror messages, but... |
| deepseek-flash | peer | pass | 10.0 | 15.0 | ship-with-cleanup | Correct implementation with good structure but lacks explanatory comments for... |
| glm | peer | pass | 9.0 | 12.0 | ship-with-cleanup | Professionally structured with errno-based exceptions and O_EXCL atomic creat... |
| kimi | self | fail | — | — | — | — |
| mimo | peer | pass | 9.0 | 15.0 | ship-with-cleanup | Robust implementation with explicit error checks and UUID-based temp files — ... |
| minimax | peer | pass | 10.0 | 15.0 | ship-with-cleanup | Solid implementation with proper errno-based exceptions and UUID-based unique... |
| qwen | peer | pass | 10.0 | 13.0 | ship-with-cleanup | Fully spec-compliant with good errno-based error messages and atomic O_EXCL f... |

tests: test_bytes_basic_write, test_cli_stdin_to_path, test_concurrent_writers_no_corruption, test_missing_parent_raises_filenotfound, test_mode_applied_when_set, test_mode_preserved_when_none_and_target_exists, test_no_tmp_residue_on_open_failure, test_no_tmp_residue_on_success, test_path_is_directory_raises, test_replaces_existing, test_symlink_writes_to_target, test_text_basic_write

builds/mimo/rounds/atomic-write-2026-05-08

| judge | tier | build | spec | qual | verdict | note |
|---|---|---|---|---|---|---|
| deepseek | peer | pass | 10.0 | 19.0 | ship-with-cleanup | Polished implementation: correct durability ordering, comprehensive error han... |
| deepseek-flash | peer | pass | 10.0 | 19.0 | ship-with-cleanup | Solid, well-structured implementation covering all spec requirements with cle... |
| glm | peer | pass | 10.0 | 16.0 | ship-with-cleanup | Best-in-class implementation: fchmod on fd before close for optimal atomicity... |
| kimi | peer | fail | — | — | — | — |
| mimo | self | pass | 9.0 | 17.0 | ship-with-cleanup | Clean, well-decomposed implementation with correct symlink handling and robus... |
| minimax | peer | pass | 10.0 | 19.0 | ship-with-cleanup | Exemplary implementation with clear documentation, proper symlink handling, a... |
| qwen | peer | pass | 10.0 | 18.0 | ship-with-cleanup | Best-structured implementation: clean helper decomposition, uses fchmod on op... |
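
For readers without the spec at hand, the judge notes and test names point at a familiar pattern: write to a temp file in the target's directory, apply the mode on the open descriptor, fsync, then `os.replace`. The sketch below is a generic POSIX illustration of that pattern. It is not any model's submission, the name `atomic_write_sketch` is hypothetical, and the error behaviour is only inferred from the test names, since the spec itself is not reproduced in this report.

```python
import os
import tempfile


def atomic_write_sketch(path, data, mode=None):
    """Write bytes so readers see either the old or the new content."""
    # Resolve symlinks so the link survives and its target gets replaced.
    target = os.path.realpath(path)
    parent = os.path.dirname(target) or "."
    if not os.path.isdir(parent):
        raise FileNotFoundError(f"parent directory does not exist: {parent}")
    if os.path.isdir(target):
        raise IsADirectoryError(f"path is a directory: {target}")

    # With no explicit mode, keep the existing file's permission bits.
    # (A brand-new file keeps mkstemp's default of 0o600 in this sketch.)
    if mode is None and os.path.exists(target):
        mode = os.stat(target).st_mode & 0o7777

    # Same directory as the target, so os.replace never crosses filesystems.
    fd, tmp = tempfile.mkstemp(dir=parent, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            if mode is not None:
                os.fchmod(f.fileno(), mode)  # set mode before the rename
            os.fsync(f.fileno())  # data on disk before it becomes visible
        os.replace(tmp, target)
    except BaseException:
        # Leave no temp residue behind on any failure, then re-raise.
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise

    # fsync the directory so the rename itself is durable (POSIX only).
    dir_fd = os.open(parent, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Applying the mode with `os.fchmod` before `os.replace` means the file never appears at its final path with the wrong permissions, which is the ordering several judge notes single out.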

tests: test_bytes_basic_write, test_cli_stdin_to_path, test_concurrent_writers_no_corruption, test_missing_parent_raises_filenotfound, test_mode_applied_when_set, test_mode_preserved_when_none_and_target_exists, test_no_tmp_residue_on_open_failure, test_no_tmp_residue_on_success, test_path_is_directory_raises, test_replaces_existing, test_symlink_writes_to_target, test_text_basic_write

builds/minimax/rounds/atomic-write-2026-05-08

| judge | tier | build | spec | qual | verdict | note |
|---|---|---|---|---|---|---|
| deepseek | peer | pass | 9.0 | 15.0 | ship-with-cleanup | Well-structured with idiomatic CLI and good symlink handling; explicit mode a... |
| deepseek-flash | peer | pass | 10.0 | 17.0 | ship-with-cleanup | Very clean implementation with excellent structure and all spec items handled... |
| glm | peer | pass | 9.0 | 13.0 | ship-with-cleanup | Feature-complete with explicit error checks and symlink handling; marred by c... |
| kimi | peer | fail | — | — | — | — |
| mimo | peer | pass | 8.0 | 13.0 | ship-with-cleanup | Well-structured implementation with good helpers — bare except, unused return... |
| minimax | self | pass | 10.0 | 16.0 | rewrite | Well-structured implementation with good documentation but applies mode AFTER... |
| qwen | peer | pass | 10.0 | 17.0 | ship-with-cleanup | Well-structured with good helper decomposition and full spec compliance, but ... |

tests: test_bytes_basic_write, test_cli_stdin_to_path, test_concurrent_writers_no_corruption, test_missing_parent_raises_filenotfound, test_mode_applied_when_set, test_mode_preserved_when_none_and_target_exists, test_no_tmp_residue_on_open_failure, test_no_tmp_residue_on_success, test_path_is_directory_raises, test_replaces_existing, test_symlink_writes_to_target, test_text_basic_write

builds/qwen/rounds/atomic-write-2026-05-08

| judge | tier | build | spec | qual | verdict | note |
|---|---|---|---|---|---|---|
| deepseek | peer | pass | 9.0 | 15.0 | ship-with-cleanup | Clean, well-typed implementation with good decomposition; sole spec gap is mi... |
| deepseek-flash | peer | pass | 9.0 | 16.0 | ship-with-cleanup | Clean and correct implementation with good structure, missing only an IsADire... |
| glm | peer | pass | 9.0 | 12.0 | ship-with-cleanup | Correct mode handling with before-replace application; missing explicit IsADi... |
| kimi | peer | fail | — | — | — | — |
| mimo | peer | pass | 7.0 | 14.0 | ship-with-cleanup | Lean, correct implementation with good error handling and type checks — missi... |
| minimax | peer | pass | 9.0 | 16.0 | ship-with-cleanup | Clean, well-structured implementation with type hints. Missing explicit IsADi... |
| qwen | self | pass | 9.0 | 16.0 | ship-with-cleanup | Clean and minimal implementation with correct fsync ordering and thorough cle... |

tests: test_bytes_basic_write, test_cli_stdin_to_path, test_concurrent_writers_no_corruption, test_missing_parent_raises_filenotfound, test_mode_applied_when_set, test_mode_preserved_when_none_and_target_exists, test_no_tmp_residue_on_open_failure, test_no_tmp_residue_on_success, test_path_is_directory_raises, test_replaces_existing, test_symlink_writes_to_target, test_text_basic_write

| impl | slug | loc | wall | tokens | cost | tests | $/test |
|---|---|---|---|---|---|---|---|
| deepseek-flash | deepseek-v4-flash | 62 | 2m05s | 102.7k | $0.0046 | 12 | $0.0004 |
| minimax | minimax-m2.5 | 105 | 0m20s | 99.8k | $0.0099 | 12 | $0.0008 |
| qwen | qwen3.6-plus | 77 | 0m53s | 115.7k | $0.022 | 12 | $0.0018 |
| mimo | mimo-v2.5-pro | 93 | 0m49s | 80.7k | $0.034 | 12 | $0.0028 |
| deepseek | deepseek-v4-pro | 51 | 3m28s | 98.4k | $0.051 | 12 | $0.0043 |
| glm | glm-5.1 | 68 | 1m38s | 72.7k | $0.058 | 12 | $0.0049 |
| kimi | kimi-k2.6 | 72 | 5m46s | 115.7k | $0.080 | 12 | $0.0067 |
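
The $/test column is consistent with output cost divided by the number of passing hidden tests. A minimal sketch follows, using the per-run costs from the run-cost table near the top of this report. The helper name `cost_per_test` is hypothetical, and because the displayed costs are themselves rounded, a recomputed value can differ from the table in the last digit for some rows.

```python
def cost_per_test(cost_usd, tests_passed):
    """Output cost in dollars per passing test, rounded to 4 decimals."""
    return round(cost_usd / tests_passed, 4)


# Costs as displayed in the run-cost table; each build passed 12 tests.
runs = {"deepseek-flash": 0.0046, "minimax": 0.0099, "kimi": 0.080}
for name, cost in runs.items():
    print(name, cost_per_test(cost, 12))
# deepseek-flash 0.0004
# minimax 0.0008
# kimi 0.0067
```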