testwall
v1.0.1
Enforce test immutability for agentic TDD workflows
LLM coding agents routinely cheat test gates: weakening assertions, deleting failing tests, modifying config, or special-casing inputs. Research (ImpossibleBench, arXiv:2510.20270) shows frontier models exploit test cases 76% of the time when given write access, but cheating drops to near zero when tests are read-only. testwall enforces that boundary.
How it works
testwall init # snapshot test files + compute SHA-256 checksums
testwall lock # chmod 444 — agent can read but not modify
testwall run # restore from snapshot, then execute tests
testwall verify # check checksums — exit 1 on any mismatch
testwall accept # verify + unlock + clean up snapshot
Even if an agent bypasses file permissions, testwall run restores the original tests from snapshot before executing them. testwall verify catches any tampering at the checksum level.
Install
# Rust
cargo install testwall
# Python
pip install testwall
# Node
npm install -g testwall
Quick start
# 1. Initialize — snapshots all test files matching default patterns
testwall init
# 2. Lock test files before handing off to an implementing agent
testwall lock
# 3. Agent implements... then run tests against the immutable snapshot
testwall run
# 4. If tests pass and nothing was tampered with, accept the result
testwall accept
Commands
testwall init [-p PATTERN...] [-c CMD]
Scan for test files, compute checksums, and store snapshots in .testwall/.
Without -p, uses built-in patterns for Python, Rust, JavaScript/TypeScript, Go, Java, and Kotlin — plus common config files like pytest.ini, jest.config.*, and .cargo/config.toml.
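The scan-and-checksum step can be pictured with a short Python sketch. This is not testwall's implementation: the function name `build_manifest` and the manifest shape (relative path mapped to a SHA-256 hex digest) are assumptions made for illustration, and the actual contents of .testwall/ may differ.

```python
import hashlib
from pathlib import Path


def build_manifest(root, patterns):
    """Map each matching file (relative POSIX path) to its SHA-256 hex digest.

    Hypothetical helper: the real manifest format is not specified in this README.
    """
    root = Path(root)
    manifest = {}
    for pattern in patterns:
        # Path.glob treats "**" as "this directory and all subdirectories".
        for path in sorted(root.glob(pattern)):
            if path.is_file():
                rel = path.relative_to(root).as_posix()
                manifest[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return manifest
```

Calling `build_manifest(".", ["tests/**/*.py", "conftest.py"])` would roughly correspond to the explicit-pattern form of the command, under the assumptions above.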
testwall init # auto-detect
testwall init -p "tests/**/*.py" -p "conftest.py" # explicit patterns
testwall init -c "pytest -x" # record the test command
testwall lock
Set all snapshotted test files to read-only (chmod 444).
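For readers unfamiliar with permission bits, here is a minimal Python sketch of what a `chmod 444` amounts to. The helper names `lock_file` and `is_locked` are hypothetical and are not part of testwall.

```python
import os
import stat

# Read permission for user, group, and other; no write or execute bits.
READ_ONLY = stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH  # 0o444


def lock_file(path):
    """Set the file mode to 0o444, the equivalent of `chmod 444`."""
    os.chmod(path, READ_ONLY)


def is_locked(path):
    """True when no write bit is set for user, group, or other."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH) == 0
```

On POSIX systems the owner of a file can restore write permission with another chmod, so the read-only bit is a deterrent rather than a guarantee. That is why the snapshot restore and checksum verification exist as separate layers.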
testwall unlock
Restore write permissions on test files.
testwall run [-c CMD] [-- extra args]
Restore test files from snapshot, then execute the test runner. This is the tamper-proof execution path — even if the agent modified the working copies, the originals run.
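The restore-then-execute idea can be sketched in a few lines of Python. This assumes a snapshot directory that mirrors the project's relative paths; that layout and the function name `restore_and_run` are illustrative assumptions, not a description of testwall's internals.

```python
import shutil
import subprocess
from pathlib import Path


def restore_and_run(snapshot_dir, project_dir, command):
    """Copy every snapshotted file over its working copy, then run the tests.

    Hypothetical layout: the snapshot mirrors the project's relative paths.
    Returns the test runner's exit code.
    """
    snapshot_dir, project_dir = Path(snapshot_dir), Path(project_dir)
    for src in snapshot_dir.rglob("*"):
        if src.is_file():
            dest = project_dir / src.relative_to(snapshot_dir)
            dest.parent.mkdir(parents=True, exist_ok=True)
            # Remove the working copy first so a read-only file does not block the copy.
            dest.unlink(missing_ok=True)
            shutil.copy2(src, dest)
    return subprocess.run(command, cwd=project_dir).returncode
```

Because the working copies are overwritten before the runner starts, an edit to a test file has no effect on the result; only the snapshotted version ever executes.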
testwall run # use command from init or auto-detect
testwall run -c "pytest" # override test command
testwall run -- -x --no-header # forward args to test runner
testwall verify [--report-only]
Compare current test file checksums against the manifest. Exits with code 1 if any file was modified or deleted.
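A rough Python sketch of this comparison follows. It assumes the manifest is a mapping from relative path to SHA-256 hex digest; the names `verify` and `exit_code` are hypothetical helpers for illustration, not testwall's API.

```python
import hashlib
from pathlib import Path


def verify(root, manifest):
    """Return (modified, deleted) lists of relative paths, compared to the manifest."""
    root = Path(root)
    modified, deleted = [], []
    for rel, expected in sorted(manifest.items()):
        path = root / rel
        if not path.is_file():
            deleted.append(rel)
        elif hashlib.sha256(path.read_bytes()).hexdigest() != expected:
            modified.append(rel)
    return modified, deleted


def exit_code(modified, deleted, report_only=False):
    """Mirror the documented behaviour: 1 on any mismatch, 0 otherwise or when report_only."""
    return 0 if report_only or not (modified or deleted) else 1
```

Since the digest covers every byte of the file, rewriting `assert x > 0` as `assert True` changes the checksum just as surely as deleting the file does.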
testwall verify # fail on mismatch
testwall verify --report-only # print report, always exit 0
testwall accept
The merge gate. Runs verification, then unlocks files and cleans up the snapshot directory. Rejects if any tampering is detected.
testwall status
Show the current manifest: file count, lock state, snapshot presence, patterns, and test command.
Default patterns
testwall ships with patterns for common test conventions:
| Ecosystem | Patterns |
|-----------|----------|
| Python | test_*.py, *_test.py, tests/**/*.py, conftest.py |
| Rust | tests/**/*.rs |
| JS/TS | **/*.test.{js,ts,tsx}, **/*.spec.{js,ts,tsx} |
| Go | **/*_test.go |
| Java/Kotlin | src/test/**/*.java, src/test/**/*.kt |
| Config | pytest.ini, setup.cfg, jest.config.*, vitest.config.*, .cargo/config.toml |
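Patterns such as `**/*.test.{js,ts,tsx}` use brace alternatives, which Python's `pathlib` globbing does not understand. If you want to preview which files a pattern would cover, a small sketch like the following can help. The helpers `expand_braces` and `find_tests` are hypothetical, and testwall's own matcher may treat edge cases differently.

```python
import re
from pathlib import Path

# Matches one brace group that contains no nested braces.
_BRACES = re.compile(r"\{([^{}]*)\}")


def expand_braces(pattern):
    """Expand `{a,b}` alternatives into a list of plain glob patterns."""
    match = _BRACES.search(pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[:match.start()], pattern[match.end():]
    expanded = []
    for option in match.group(1).split(","):
        expanded.extend(expand_braces(head + option + tail))
    return expanded


def find_tests(root, patterns):
    """Return sorted relative POSIX paths of files matching any pattern."""
    root = Path(root)
    found = set()
    for pattern in patterns:
        for plain in expand_braces(pattern):
            found.update(
                p.relative_to(root).as_posix() for p in root.glob(plain) if p.is_file()
            )
    return sorted(found)
```

For example, `expand_braces("**/*.test.{js,ts,tsx}")` yields the three plain patterns `**/*.test.js`, `**/*.test.ts`, and `**/*.test.tsx`, each of which `Path.glob` can handle.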
Typical workflow
You (test author) testwall Agent (implementer)
───────────────── ──────── ───────────────────
Write tests
├──── testwall init ────►
├──── testwall lock ────►
│ Agent implements code
│ Agent tries to edit tests → DENIED
│ Agent runs testwall run
│ ◄──── tests execute from snapshot
│ Tests pass
├──── testwall accept ──►
│ ✓ checksums match
│ ✓ files unlocked
│ ✓ snapshot cleaned
What it catches
- Weakened assertions (assert x > 0 → assert True)
- Deleted test cases
- Modified test config (conftest.py, jest.config.*)
- Special-cased test inputs
- Swapped test runner flags
- Any byte-level change to snapshotted files
License
MIT
