testwall

v1.0.1

Published 4 months ago

Enforce test immutability for agentic TDD workflows

Vulnerabilities: 0 High, 0 Medium, 0 Low

metacogdev

testing tdd llm agent immutability

LLM coding agents routinely cheat test gates: they weaken assertions, delete failing tests, modify config, or special-case inputs. Research (ImpossibleBench, arXiv:2510.20270) shows frontier models exploit test cases up to 76% of the time when given write access, and that cheating drops to near zero when tests are read-only. testwall enforces that boundary.

How it works

testwall init       # snapshot test files + compute SHA-256 checksums
testwall lock       # chmod 444 — agent can read but not modify
testwall run        # restore from snapshot, then execute tests
testwall verify     # check checksums — exit 1 on any mismatch
testwall accept     # verify + unlock + clean up snapshot

Even if an agent bypasses file permissions, testwall run restores the original tests from snapshot before executing them. testwall verify catches any tampering at the checksum level.

Install

# Rust
cargo install testwall

# Python
pip install testwall

# Node
npm install -g testwall

Quick start

# 1. Initialize — snapshots all test files matching default patterns
testwall init

# 2. Lock test files before handing off to an implementing agent
testwall lock

# 3. Agent implements... then run tests against the immutable snapshot
testwall run

# 4. If tests pass and nothing was tampered with, accept the result
testwall accept

Commands

`testwall init [-p PATTERN...] [-c CMD]`

Scan for test files, compute checksums, and store snapshots in .testwall/.

Without -p, uses built-in patterns for Python, Rust, JavaScript/TypeScript, Go, Java, and Kotlin — plus common config files like pytest.ini, jest.config.*, and .cargo/config.toml.

testwall init                              # auto-detect
testwall init -p "tests/**/*.py" -p "conftest.py"  # explicit patterns
testwall init -c "pytest -x"              # record the test command
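To make the init step concrete, here is a minimal Python sketch of a scan, checksum, and snapshot pass. It is an illustration, not testwall's implementation: the function names, the `snapshot/` subdirectory, and the `manifest.json` file are assumptions (the README only states that snapshots live in .testwall/).

```python
import hashlib
import json
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def init_snapshot(root: Path, patterns: list[str], state_dir: str = ".testwall") -> dict[str, str]:
    """Copy matching files into a snapshot directory and record their digests."""
    state = root / state_dir
    snapshot = state / "snapshot"  # hypothetical layout
    snapshot.mkdir(parents=True, exist_ok=True)
    manifest: dict[str, str] = {}
    for pattern in patterns:
        for path in sorted(root.glob(pattern)):
            rel = path.relative_to(root)
            # Skip directories and anything already inside the state directory.
            if not path.is_file() or rel.parts[0] == state_dir:
                continue
            target = snapshot / rel
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)
            manifest[rel.as_posix()] = sha256_of(path)
    # Hypothetical manifest file name and format.
    (state / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return manifest


# Usage: snapshot a tiny project in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    (project / "tests").mkdir()
    (project / "tests" / "test_math.py").write_text("assert 1 + 1 == 2\n")
    (project / "conftest.py").write_text("")
    print(sorted(init_snapshot(project, ["tests/**/*.py", "conftest.py"])))
```

Keying the manifest by relative POSIX path keeps it portable between machines and operating systems.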

`testwall lock`

Set all snapshotted test files to read-only (chmod 444).

`testwall unlock`

Restore write permissions on test files.
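The lock and unlock steps amount to toggling permission bits. The Python sketch below shows one way to do that. The `lock` and `unlock` helpers are hypothetical names, and restoring only the owner write bit on unlock is an assumption, since the README does not say which mode unlock restores.

```python
import os
import stat
import tempfile
from pathlib import Path

READ_ONLY = stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH  # 0o444


def lock(paths):
    """Make each file read-only for owner, group, and others."""
    for path in paths:
        os.chmod(path, READ_ONLY)


def unlock(paths):
    """Add the owner write bit back, leaving the other bits unchanged."""
    for path in paths:
        mode = stat.S_IMODE(os.stat(path).st_mode)
        os.chmod(path, mode | stat.S_IWUSR)


# Usage on a POSIX system: inspect the mode after each step.
with tempfile.TemporaryDirectory() as tmp:
    test_file = Path(tmp) / "test_example.py"
    test_file.write_text("assert 1 + 1 == 2\n")
    lock([test_file])
    print(oct(stat.S_IMODE(test_file.stat().st_mode)))
    unlock([test_file])
    print(oct(stat.S_IMODE(test_file.stat().st_mode)))
```

A file's owner can always change its mode back, so read-only bits deter edits rather than prevent them. This is why the snapshot restore and the checksum check exist as further layers.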

`testwall run [-c CMD] [-- extra args]`

Restore test files from snapshot, then execute the test runner. This is the tamper-proof execution path — even if the agent modified the working copies, the originals run.

testwall run                    # use command from init or auto-detect
testwall run -c "pytest"        # override test command
testwall run -- -x --no-header  # forward args to test runner
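The restore-then-execute idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not testwall's code: `restore_and_run` and the snapshot location are hypothetical, and the test command is a stand-in script rather than a real test runner.

```python
import shlex
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def restore_and_run(root: Path, snapshot: Path, command: str, extra_args=()) -> int:
    """Overwrite working copies with their snapshot copies, then run the command."""
    for src in sorted(snapshot.rglob("*")):
        if not src.is_file():
            continue
        dest = root / src.relative_to(snapshot)
        dest.parent.mkdir(parents=True, exist_ok=True)
        if dest.exists():
            dest.chmod(0o644)  # a locked (read-only) file must be writable to overwrite
        shutil.copy2(src, dest)
    argv = shlex.split(command) + list(extra_args)
    return subprocess.run(argv, cwd=root).returncode


# Usage: the working copy was tampered to pass, but the original still runs.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    snap = project / ".testwall" / "snapshot"  # hypothetical layout
    snap.mkdir(parents=True)
    (snap / "check.py").write_text("raise SystemExit(1)  # original failing test\n")
    (project / "check.py").write_text("raise SystemExit(0)  # tampered to pass\n")
    code = restore_and_run(project, snap, f"{shlex.quote(sys.executable)} check.py")
    print(code)  # 1, because the restored original is what executes
```

Restoring unconditionally, rather than only when a mismatch is found, keeps the execution path simple: whatever sits in the working tree is ignored.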

`testwall verify [--report-only]`

Compare current test file checksums against the manifest. Exits with code 1 if any file was modified or deleted.

testwall verify                 # fail on mismatch
testwall verify --report-only   # print report, always exit 0
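As a rough model of the verification step, the Python sketch below compares current digests with a manifest and derives the exit code described above. The `verify` and `exit_code` functions and the manifest shape (relative path mapped to hex digest) are assumptions made for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def verify(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return one finding per file that is missing or whose digest changed."""
    findings = []
    for rel, expected in sorted(manifest.items()):
        path = root / rel
        if not path.is_file():
            findings.append(f"deleted: {rel}")
        elif hashlib.sha256(path.read_bytes()).hexdigest() != expected:
            findings.append(f"modified: {rel}")
    return findings


def exit_code(findings: list[str], report_only: bool = False) -> int:
    """Exit 1 on any finding unless report-only mode is requested."""
    return 0 if report_only or not findings else 1


# Usage: record a digest, tamper with the file, then verify.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    test_file = project / "test_a.py"
    test_file.write_text("assert compute() > 0\n")
    recorded = {"test_a.py": hashlib.sha256(test_file.read_bytes()).hexdigest()}
    test_file.write_text("assert True\n")
    report = verify(project, recorded)
    print(report, exit_code(report), exit_code(report, report_only=True))
```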

`testwall accept`

The merge gate. Runs verification, then unlocks files and cleans up the snapshot directory. Rejects if any tampering is detected.
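The ordering inside the gate matters: nothing is unlocked or cleaned up until every file has passed verification. A minimal Python sketch of that all-or-nothing behaviour follows. The `accept` function, the manifest shape, and the choice to leave locks and the snapshot in place on rejection are assumptions, not documented testwall behaviour.

```python
import hashlib
import os
import shutil
import stat
import tempfile
from pathlib import Path


def accept(root: Path, manifest: dict[str, str], state_dir: str = ".testwall") -> bool:
    """Unlock files and delete the state directory only if every digest matches."""
    for rel, expected in manifest.items():
        path = root / rel
        if not path.is_file() or hashlib.sha256(path.read_bytes()).hexdigest() != expected:
            return False  # reject: nothing is unlocked or cleaned up
    for rel in manifest:
        path = root / rel
        os.chmod(path, stat.S_IMODE(path.stat().st_mode) | stat.S_IWUSR)
    shutil.rmtree(root / state_dir, ignore_errors=True)
    return True


# Usage: an untouched, locked test file passes the gate.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    (project / ".testwall").mkdir()
    test_file = project / "test_a.py"
    test_file.write_text("assert 1 + 1 == 2\n")
    recorded = {"test_a.py": hashlib.sha256(test_file.read_bytes()).hexdigest()}
    test_file.chmod(0o444)
    accepted = accept(project, recorded)
    state_removed = not (project / ".testwall").exists()
    print(accepted, state_removed)
```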

`testwall status`

Show the current manifest: file count, lock state, snapshot presence, patterns, and test command.

Default patterns

testwall ships with patterns for common test conventions:

| Ecosystem | Patterns |
|-----------|----------|
| Python | `test_*.py`, `*_test.py`, `tests/**/*.py`, `conftest.py` |
| Rust | `tests/**/*.rs` |
| JS/TS | `**/*.test.{js,ts,tsx}`, `**/*.spec.{js,ts,tsx}` |
| Go | `**/*_test.go` |
| Java/Kotlin | `src/test/**/*.java`, `src/test/**/*.kt` |
| Config | `pytest.ini`, `setup.cfg`, `jest.config.*`, `vitest.config.*`, `.cargo/config.toml` |
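To see what patterns like these select, the Python sketch below expands `{a,b}` groups and feeds the results to `pathlib` globbing. How testwall itself matches globs is not documented here, so the `expand_braces` and `find_test_files` helpers are only an approximation. Python's own glob has no brace syntax, hence the expansion step.

```python
import re
import tempfile
from pathlib import Path


def expand_braces(pattern: str) -> list[str]:
    """Expand {a,b,c} groups into one plain glob pattern per alternative."""
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[: match.start()], pattern[match.end():]
    expanded = []
    for option in match.group(1).split(","):
        expanded.extend(expand_braces(head + option + tail))
    return expanded


def find_test_files(root: Path, patterns: list[str]) -> list[str]:
    """Return sorted relative paths of files matching any pattern."""
    found = set()
    for pattern in patterns:
        for concrete in expand_braces(pattern):
            for path in root.glob(concrete):
                if path.is_file():
                    found.add(path.relative_to(root).as_posix())
    return sorted(found)


# Usage: only the test files are selected, not the implementation file.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    for rel in ["src/app.test.ts", "src/app.ts", "pkg/util_test.go", "test_core.py"]:
        (project / rel).parent.mkdir(parents=True, exist_ok=True)
        (project / rel).write_text("")
    matched = find_test_files(project, ["**/*.test.{js,ts,tsx}", "**/*_test.go", "test_*.py"])
    print(matched)
```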

Typical workflow

  You (test author)          testwall            Agent (implementer)
  ─────────────────          ────────            ───────────────────
  Write tests
          ├──── testwall init ────►
          ├──── testwall lock ────►
          │                              Agent implements code
          │                              Agent tries to edit tests → DENIED
          │                              Agent runs testwall run
          │                        ◄──── tests execute from snapshot
          │                              Tests pass
          ├──── testwall accept ──►
          │     ✓ checksums match
          │     ✓ files unlocked
          │     ✓ snapshot cleaned
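The final part of this workflow could be scripted as a small gate that runs after the agent finishes. The Python sketch below uses only the `testwall run` and `testwall accept` commands from this README. It assumes that both exit non-zero on failure, which is documented only for `testwall verify`. `run_gate` and its injectable `execute` parameter are hypothetical.

```python
import subprocess
from typing import Callable, Sequence

GATE_STEPS = (("testwall", "run"), ("testwall", "accept"))


def subprocess_exit_code(argv: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(list(argv)).returncode


def run_gate(execute: Callable[[Sequence[str]], int] = subprocess_exit_code) -> bool:
    """Run each gate step in order and stop at the first non-zero exit code."""
    for argv in GATE_STEPS:
        if execute(argv) != 0:  # assumption: non-zero means failure
            return False
    return True


# Usage with a stand-in executor, so the sketch runs without testwall installed.
calls = []


def fake_execute(argv):
    calls.append(list(argv))
    return 0


print(run_gate(fake_execute), calls)
```

Injecting the executor keeps the gate logic testable without testwall installed. In real use, call `run_gate()` with no arguments.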

What it catches

Weakened assertions (assert x > 0 → assert True)
Deleted test cases
Modified test config (conftest.py, jest.config.*)
Special-cased test inputs
Swapped test runner flags
Any byte-level change to snapshotted files
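The last item follows directly from how a cryptographic hash behaves: any change to the bytes, however small, yields a different digest. A short Python illustration with made-up test content:

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


original = b"def test_positive():\n    assert compute() > 0\n"
weakened = b"def test_positive():\n    assert True\n"
trailing_newline = original + b"\n"  # a whitespace-only edit

print(digest(original) == digest(weakened))          # False
print(digest(original) == digest(trailing_newline))  # False
print(digest(original) == digest(bytes(original)))   # True
```

A consequence is that harmless edits such as reformatting are flagged too, so snapshotted files are best left untouched until they are accepted.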

License

MIT
