apply-edit

A single-file Python module providing one operation: search-replace patching of file contents. This is the primitive that agent harnesses (Cursor, aider, Claude Code, etc.) use to translate model-emitted edits into actual file changes.

language: python
entrypoint: apply_edit.py
test runner: pytest
kind: code

No runs yet — lineup hasn't been exercised against this task.

contract

must

  • Python 3.10+, stdlib only — no pip install, no new dependencies.
  • apply_edit must be a pure function — no I/O, no logging, no globals.
  • No regex. Literal substring matching only.
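  • Match is byte-exact at the string level: no whitespace normalization, no case folding, no line-ending normalization.
  • The three exception classes must inherit as documented in SPEC.md (specific classes inherit from EditError).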

out of scope

  • Multi-block patches (one edit per call only).
  • Fuzzy / context-aware matching. The point is strict matching.
  • Unified-diff parsing.
  • File I/O inside apply_edit itself.

full contract

These are the raw documents handed to every model and every judge. Treat them as the source of truth.

SPEC spec.md what done looks like

apply_edit.py — implementation spec

A single-file Python module providing one operation: search-replace patching of file contents. This is the primitive that agent harnesses (Cursor, aider, Claude Code, etc.) use to translate model-emitted edits into actual file changes.

The point of the function is not "replace text". str.replace already does that. The point is to apply edits safely — raising loudly when the edit is ambiguous or impossible, never silently mutating the wrong location.

Public API

def apply_edit(
    file_text: str,
    old: str,
    new: str,
    *,
    replace_all: bool = False,
) -> str: ...

Returns a new string with old replaced by new in file_text. The input is never mutated: strings are immutable anyway, and the function must have no other side effects.

Exceptions

Three exception types, all module-level:

class EditError(Exception): ...
class EditNotFound(EditError): ...
class EditAmbiguous(EditError): ...

The two specific subclasses must inherit from EditError so callers can catch either the specific failure or EditError as a base.
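As a minimal sketch of what this hierarchy buys the caller, the snippet below catches the specific failures first and falls back to the base class. The three class names come from the spec; the describe_failure helper is hypothetical and exists only for illustration.

```python
class EditError(Exception):
    """Base class for every edit failure."""


class EditNotFound(EditError):
    """Raised when `old` does not occur in the text."""


class EditAmbiguous(EditError):
    """Raised when `old` occurs 2+ times and replace_all is False."""


def describe_failure(exc: Exception) -> str:
    # Hypothetical helper (not part of the spec): specific handlers come
    # first, then the shared base class catches anything that remains.
    try:
        raise exc
    except EditNotFound:
        return "not found"
    except EditAmbiguous:
        return "ambiguous"
    except EditError:
        return "other edit error"
```

A caller that only cares about "did the edit fail" can use a single except EditError clause; a caller that wants to retry with more context can branch on the subclass.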

Behaviour

| Case | What must happen |
| --- | --- |
| old is the empty string | Raise ValueError. An empty needle is never a valid edit. |
| old does not appear in file_text | Raise EditNotFound. |
| old appears exactly once | Return file_text with that single occurrence replaced by new. |
| old appears 2+ times and replace_all=False | Raise EditAmbiguous. Do not silently replace the first match. This is the whole reason this function exists rather than str.replace. |
| old appears 2+ times and replace_all=True | Replace every occurrence. Return the new string. |
| old == new | Still validated for presence/ambiguity per the rules above; if it would otherwise succeed, return file_text unchanged. |
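The cases above can be sketched as a count-then-decide function. This is one possible shape under stated assumptions, not the reference solution: it counts with str.count, which counts non-overlapping occurrences (the spec does not say how overlapping matches should be counted), and the EditNotFound message wording is illustrative.

```python
class EditError(Exception):
    pass


class EditNotFound(EditError):
    pass


class EditAmbiguous(EditError):
    pass


def apply_edit(file_text: str, old: str, new: str, *, replace_all: bool = False) -> str:
    if old == "":
        # An empty needle is rejected before any searching happens.
        raise ValueError("old must not be empty")

    # Assumption: non-overlapping count, as computed by str.count.
    count = file_text.count(old)
    if count == 0:
        raise EditNotFound(f"old string not found: {old[:80]!r}")
    if count > 1 and not replace_all:
        raise EditAmbiguous(
            f"old string matched {count} times; pass replace_all=True to replace all"
        )

    # Either exactly one match, or replace_all=True: replacing every
    # occurrence is correct in both situations.
    return file_text.replace(old, new)
```

Because the ambiguity check runs before any replacement, the old == new case needs no special branch: validation happens as usual and str.replace then returns text equal to the input.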

Whitespace, line endings, encoding

  • Match is byte-exact at the string level — no whitespace normalization, no leading/trailing strip, no case folding. Indentation must match exactly.
  • file_text may contain \n, \r\n, or a mix. The function operates on the string as given; it does not normalize line endings.
  • The input is str, not bytes. Callers handle encoding.
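To make the strictness concrete, here is a small illustration using only built-in str operations. The sample strings are invented for the example; the point is that a needle differing only in line endings, indentation, or case does not match.

```python
# File content with Windows-style line endings and four-space indentation.
crlf_text = "def f():\r\n    return 1\r\n"

lf_needle = "def f():\n    return 1"      # same text, but LF line ending
crlf_needle = "def f():\r\n    return 1"  # exact copy, CRLF line ending
tab_needle = "\treturn 1"                 # tab instead of four spaces
upper_needle = "RETURN 1"                 # different case

lf_matches = crlf_text.count(lf_needle)      # 0: "\n" does not match "\r\n"
crlf_matches = crlf_text.count(crlf_needle)  # 1: exact match
tab_found = tab_needle in crlf_text          # False: indentation differs
upper_found = upper_needle in crlf_text      # False: no case folding
```

Under the spec, the first, third, and fourth needles would all lead to EditNotFound rather than a "close enough" replacement.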

Error messages

Exception messages must be informative enough for an agent to react:

  • EditNotFound: include the first ~80 chars of old (truncated with an ellipsis if longer) so logs show what was searched for.
  • EditAmbiguous: include the match count, e.g. "old string matched 4 times; pass replace_all=True to replace all".
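One way to build these two messages is sketched below. The helper names, the exact 80-character cut-off, the use of repr, and the "..." truncation marker are all assumptions made for illustration; only the EditAmbiguous wording is taken from the example in the spec.

```python
MAX_PREVIEW = 80  # the spec says "~80 chars"; the exact limit is an assumption


def preview(old: str, limit: int = MAX_PREVIEW) -> str:
    # repr() keeps newlines and tabs visible in logs as \n and \t.
    if len(old) <= limit:
        return repr(old)
    # Assumed truncation marker: three ASCII dots after the cut-off.
    return repr(old[:limit]) + "..."


def not_found_message(old: str) -> str:
    return f"old string not found: {preview(old)}"


def ambiguous_message(count: int) -> str:
    return f"old string matched {count} times; pass replace_all=True to replace all"
```

Keeping message construction in small pure helpers leaves apply_edit itself free of any formatting detail beyond calling them.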

CLI

python apply_edit.py <path> <<EOF
<<<<<<< OLD
<old text>
=======
<new text>
>>>>>>> NEW
EOF

The CLI reads a single edit block from stdin in the format above (literal <<<<<<< OLD, =======, >>>>>>> NEW markers; no leading spaces), applies it to the file at <path>, and writes the result back to that path.

  • Exit 0 on a successful single-match edit.
  • Exit 2 on EditNotFound. Print the exception message to stderr.
  • Exit 3 on EditAmbiguous. Print the exception message to stderr.
  • Exit 1 on any other error (missing file, malformed stdin, etc.) with a stderr message.
  • --replace-all flag: if passed, set replace_all=True for the call.

The CLI must not require any third-party libraries.
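The stdin format can be parsed with plain string operations. The sketch below is a hypothetical parse_edit_block helper, not something the spec names. It assumes the markers sit on their own lines, that stdin uses \n line endings, and that the first ======= line after the OLD marker is the separator (so an old text that itself contains such a line cannot be expressed).

```python
def parse_edit_block(raw: str) -> tuple[str, str]:
    """Split one edit block into (old, new). Raises ValueError if malformed."""
    lines = raw.split("\n")
    try:
        # list.index raises ValueError when a marker line is missing.
        start = lines.index("<<<<<<< OLD")
        mid = lines.index("=======", start + 1)
        end = lines.index(">>>>>>> NEW", mid + 1)
    except ValueError:
        raise ValueError("malformed edit block: expected OLD, =======, NEW markers") from None

    # Rejoin the lines between markers; the newline before each marker
    # is treated as part of the marker, not of the old/new text.
    old = "\n".join(lines[start + 1:mid])
    new = "\n".join(lines[mid + 1:end])
    return old, new
```

In a main() built on this, the ValueError would map to exit code 1 ("malformed stdin"), while EditNotFound and EditAmbiguous from apply_edit map to 2 and 3. Working line by line also lets an empty replacement (nothing between ======= and >>>>>>> NEW) parse as new == "".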

Hard constraints

  • Python 3.10+, stdlib only.
  • Pure function semantics for apply_edit: no logging, no I/O, no global state. The CLI is a separate main() that does I/O.
  • No regex matching — search is literal substring. (Regex is what half of these tools get wrong; this task is the simple version.)

Out of scope

  • Multi-block patches (one edit per call only).
  • Fuzzy / context-aware matching. The point is strict matching.
  • Unified-diff parsing.
  • File I/O inside apply_edit itself.

PRMT prompt.md what the model reads

Task: implement apply_edit.py

Read SPEC.md in this directory. Implement apply_edit.py per spec: one library function (apply_edit), three exception classes (EditError, EditNotFound, EditAmbiguous), plus a CLI entry point.

This task covers only apply_edit.py. Do not create helper modules, test files, or packaging metadata.

Reference (read carefully)

Below is a starter implementation that someone tried to ship. It has at least one bug — at least one case where it does not match the behaviour required by SPEC.md. Your job:

  1. Decide what is wrong with it.
  2. Write a correct apply_edit.py from scratch (do not paste this in verbatim).
  3. Add a single short comment at the top of your file naming the bug you found, in the form: # bug in reference: <one line>.

You are not required to keep the reference's structure. Use whatever shape is cleanest. The only requirement is that the resulting module passes the spec.

class EditError(Exception):
    pass


class EditNotFound(EditError):
    pass


class EditAmbiguous(EditError):
    pass


def apply_edit(file_text, old, new, *, replace_all=False):
    if not old:
        raise ValueError("old must not be empty")
    if old not in file_text:
        raise EditNotFound(f"old string not found: {old[:80]!r}")
    if replace_all:
        return file_text.replace(old, new)
    return file_text.replace(old, new, 1)


def main():
    import sys
    if len(sys.argv) < 2:
        print("usage: apply_edit.py <path> [--replace-all]", file=sys.stderr)
        sys.exit(1)
    path = sys.argv[1]
    replace_all = "--replace-all" in sys.argv[2:]
    raw = sys.stdin.read()
    # parse <<<<<<< OLD ... ======= ... >>>>>>> NEW block
    try:
        head, rest = raw.split("<<<<<<< OLD\n", 1)
        old, rest = rest.split("\n=======\n", 1)
        new, _ = rest.split("\n>>>>>>> NEW", 1)
    except ValueError:
        print("malformed stdin", file=sys.stderr)
        sys.exit(1)
    with open(path, "r") as f:
        contents = f.read()
    try:
        result = apply_edit(contents, old, new, replace_all=replace_all)
    except EditNotFound as e:
        print(str(e), file=sys.stderr)
        sys.exit(2)
    except EditAmbiguous as e:
        print(str(e), file=sys.stderr)
        sys.exit(3)
    with open(path, "w") as f:
        f.write(result)


if __name__ == "__main__":
    main()

Hard constraints

  • Python 3.10+, stdlib only — no pip install, no new dependencies.
  • apply_edit must be a pure function — no I/O, no logging, no globals.
  • No regex. Literal substring matching only.
  • Match is byte-exact at the string level — no whitespace normalization, no case folding, no line-ending normalization.
  • The three exception classes must inherit as documented in SPEC.md (specific classes inherit from EditError).

Deliverable

A single file apply_edit.py at the worktree root that:

  1. Defines EditError, EditNotFound, EditAmbiguous.
  2. Defines apply_edit(file_text, old, new, *, replace_all=False) -> str matching the spec exactly.
  3. Provides a CLI per SPEC.md's "CLI" section, with the exit-code contract (0 / 1 / 2 / 3) and the --replace-all flag.

What to do when finished

  1. Run a quick smoke test in your head: single match replaces; two matches without replace_all raises; old="" raises ValueError; old not in text raises EditNotFound.
  2. State: "Done. Implementation in apply_edit.py."

What NOT to do

  • Do not modify PROMPT.md or SPEC.md.
  • Do not paste the reference verbatim.
  • Do not add requirements.txt, pyproject.toml, or any other dependency manifest.
  • Do not write test files; the hidden tests are added later.
  • Do not import any third-party package (no regex, no rich, etc.).

RUBR judge_rubric.md how judges score

Judge rubric: apply-edit task

Fill one copy per implementation, saved as output/<label>_rubric.md. Also write output/<label>_scores.json with the structured form (see JUDGE_PROMPT.md).

Implementation reviewed: <label> (e.g. A, B, C)
File: implementations/<label>.py

Hard-fail (any miss = fail run)

Cite line numbers when something fails.

  • [ ] apply_edit.py provided as <label>.py
  • [ ] Top-level apply_edit(file_text, old, new, *, replace_all=False) -> str matches SPEC signature
  • [ ] Module defines EditError, EditNotFound, EditAmbiguous
  • [ ] EditNotFound and EditAmbiguous both inherit from EditError
  • [ ] No external Python dependencies (stdlib-only imports)
  • [ ] No regex — literal substring match only
  • [ ] apply_edit is pure: no I/O, no global state, no logging inside it

Hard-fail result: pass / fail
If fail, reasons (with line refs):

Spec compliance — score 0–10

Award 1 point per item present and correct. Cite line numbers.

  • [ ] old == "" raises ValueError (not EditError, not silent return)
  • [ ] old not in file_text raises EditNotFound
  • [ ] EditNotFound message includes (a truncated form of) old
  • [ ] Single match: returns file_text with that one occurrence replaced
  • [ ] Multi-match w/ replace_all=False: raises EditAmbiguous (NOT silently replaces first — this is the bug in the reference)
  • [ ] EditAmbiguous message includes the match count
  • [ ] Multi-match w/ replace_all=True: replaces every occurrence
  • [ ] Match is byte-exact: no whitespace normalization, no case folding, no line-ending normalization
  • [ ] CLI exit codes match spec (0 success, 2 not-found, 3 ambiguous, 1 other)
  • [ ] CLI --replace-all flag wired through to the call

Subtotal: __ / 10
Notes:

Code quality — score each 0–5

  • [ ] Clarity — naming, structure, function decomposition: __
  • [ ] Conciseness — no over-engineering, no unused branches: __
  • [ ] Error handling — distinct exception types per spec; CLI exit-code contract honoured: __
  • [ ] Comments — at minimum a # bug in reference: line naming what was wrong; otherwise comments only at non-obvious points: __

Subtotal: __ / 20

Bug-diagnosis bonus (informational, not scored)

Did the model correctly identify the bug in the reference? The expected diagnosis is: "silently replaces only the first occurrence on multi-match instead of raising EditAmbiguous". Note in the rubric whether the model's # bug in reference: comment matches.
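The expected diagnosis can be reproduced with the built-in call the reference ends on. The sample text is invented for the example; str.replace with a count of 1 is standard Python behaviour.

```python
text = "x = 1\ny = 1\n"

# What the reference does when replace_all is False: replace the first
# match only, even though the needle occurs twice.
patched = text.replace("1", "2", 1)

occurrences = text.count("1")  # 2, so the spec requires EditAmbiguous here
```

Here patched is "x = 2\ny = 1\n": the first assignment changed, the second did not, and no error was raised. A compliant implementation would refuse this edit unless replace_all=True is passed.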

One-line summary

Verdict

ship-with-cleanup / rewrite / unusable