empiric
v0.1.2
A scientific-method CLI for local problem decomposition and software experiments.
Empiric
Empiric is a local CLI for breaking coding problems into small parts and running software experiments with the scientific method.
It helps you:
- state coding goals,
- break them into atomic problem parts,
- keep a categorized backlog of deletion, simplification, rebuild, tuning, measurement, and automation ideas,
- audit components for owner, requirement, and deletion proof,
- enforce a deletion-first workflow gate before part-linked experiments,
- guide agents toward small local experiments,
- state falsifiable hypotheses,
- design minimal code-change experiments,
- run benchmark commands,
- capture speed and memory evidence,
- record accept/reject/inconclusive interpretations,
- keep an append-only logbook under .empiric/.
Install locally
npm install
npm run build
npm link
You can also run the local wrapper after building:
./tools/empiric --help
npx empiric --help
Quick start
empiric init
empiric problem add \
--title "Decide whether parser caching is worth building" \
--goal "Find the smallest evidence-backed parser caching approach." \
--context "Repeated parser calls may be slowing the hot path." \
--success "The agent knows whether to build, skip, or narrow the cache." \
--constraints "Do not change parser output or public APIs."
empiric part add \
--problem prob_20260429_example \
--title "Check for repeated inputs" \
--question "Do identical parser inputs repeat during one representative run?" \
--why "Caching only helps if repeated inputs are common enough." \
--success "A local probe reports repeat counts." \
--experiment "Print input hashes during the benchmark, then remove the probe." \
--owner "Parser owner" \
--requirement "Avoid repeated parser work only when it is observable." \
--deletion-proof "The no-cache baseline proves repeated parsing is not material."
empiric idea add \
--category delete \
--title "Remove receipt retention from hot path" \
--rationale "Retained receipt data may dominate memory under load." \
--expected-impact "Lower peak memory at the same TPS." \
--next "Delete retention and run the capped benchmark."
empiric audit add \
--part part_20260429_example \
--component "parser cache" \
--owner "Parser owner" \
--requirement "Only keep this if repeated inputs are common." \
--deletion-proof "Deleting it causes measurable repeated parser work." \
--step delete \
--status kept \
--next "Simplify the retained path."
empiric audit gate --part part_20260429_example
empiric experiment plan \
--part part_20260429_example \
--kind probe \
--title "Measure repeated parser inputs" \
--change "Add temporary input-hash logging around the parser." \
--benchmark "npm test -- --bench" \
--success "The command prints repeat-count evidence." \
--rollback "Remove the temporary logging."
empiric run --experiment exp_20260429_probe
empiric result --experiment exp_20260429_probe --status accepted --interpretation "Repeated inputs exist." --next "Build the smallest cache."
empiric part decide --part part_20260429_example --decision "Build a narrow parser cache." --evidence "The probe found repeated inputs." --next "Implement the smallest cache."
empiric goal
empiric status
empiric problem show --problem prob_20260429_example
empiric hypothesis add \
--title "Avoid repeated parsing" \
--statement "Caching parsed input will reduce benchmark duration without increasing max RSS by more than 5%." \
--speed "10% faster benchmark duration" \
--memory "no more than 5% higher max RSS" \
--mechanism "Parsing is repeated for identical inputs in the hot path." \
--assumptions "The parsed data is immutable during a run." \
--simpler-baseline "Measure the current parser without code changes." \
--must-be-true "Parser time must be a meaningful share of total runtime."
empiric experiment plan \
--hypothesis hyp_20260428_example \
--title "Cache parsed input by content hash" \
--change "Add a narrow in-memory cache around the parser." \
--benchmark "npm test -- --bench" \
--speed-target "10% faster duration" \
--memory-target "max RSS within 5%" \
--rollback "Remove the cache wrapper."
empiric log
Benchmark RESULT ingestion
Empiric can parse harness output lines that begin with RESULT:
RESULT target_tps=40000 delivered=40000 dropped=0 avg_tps=39980 peak_tps=41000 memory_peak_mb=1830 oom=0 generator_failures=0 log=/tmp/run.log
Pipe saved output directly into an experiment:
benchmark-command | empiric result ingest --experiment exp_20260429_probe --from-stdin
Or let empiric run capture and ingest RESULT lines automatically:
empiric run --experiment exp_20260429_probe --build-path ./build/nodeos --flags "2gb p2p"
empiric ceiling
empiric next
empiric ledger validate
Promotion defaults for performance experiments are two clean runs with no dropped transactions, no generator failures, and no OOM events.
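To make the RESULT format and the promotion defaults concrete, here is a minimal TypeScript sketch. It is not Empiric's own code: the function names are hypothetical, the line is assumed to be whitespace-separated key=value pairs as in the sample above, and "two clean runs" is read as "at least two runs with dropped=0, generator_failures=0, and oom=0".

```typescript
// Hypothetical sketch. parseResultLine, isCleanRun, and meetsPromotionDefault
// are illustrative names, not part of Empiric's API.

type ResultFields = Record<string, string>;

// Parse a harness line such as "RESULT target_tps=40000 dropped=0 ..." into
// key/value pairs. Returns null for lines that do not begin with RESULT.
function parseResultLine(line: string): ResultFields | null {
  const tokens = line.trim().split(/\s+/);
  if (tokens[0] !== "RESULT") return null;
  const fields: ResultFields = {};
  for (const token of tokens.slice(1)) {
    const eq = token.indexOf("=");
    if (eq <= 0) continue; // skip tokens that are not key=value
    fields[token.slice(0, eq)] = token.slice(eq + 1);
  }
  return fields;
}

// Assumption: a run is clean when nothing was dropped, no generator failed,
// and no OOM occurred. A missing field counts as not clean.
function isCleanRun(fields: ResultFields): boolean {
  return ["dropped", "generator_failures", "oom"].every(
    (key) => fields[key] === "0"
  );
}

// Assumption: the default means at least `required` clean runs.
function meetsPromotionDefault(runs: ResultFields[], required = 2): boolean {
  return runs.filter(isCleanRun).length >= required;
}

const sample =
  "RESULT target_tps=40000 delivered=40000 dropped=0 avg_tps=39980 " +
  "peak_tps=41000 memory_peak_mb=1830 oom=0 generator_failures=0 log=/tmp/run.log";
const parsed = parseResultLine(sample);
console.log(parsed ? parsed["memory_peak_mb"] : "not a RESULT line");
```

Values stay as strings in this sketch so that fields like log=/tmp/run.log pass through unchanged; a real harness would convert the numeric fields it cares about.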
Agent workflow
- Create the problem with empiric problem add.
- Break it into atomic parts with empiric part add.
- Add backlog ideas with empiric idea add --category delete|simplify|rebuild|tune|measure|automate.
- Audit each questioned component with empiric audit add, including owner, requirement, and deletion proof.
- Pass empiric audit gate --part ... before planning linked experiments.
- Plan and run a linked experiment with empiric experiment plan --part ... and empiric run.
- Record a result, then decide the part with empiric part decide.
- Use empiric goal, empiric status, empiric problem show, and empiric log to see what is known and what remains.
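For the "record a result" step, the speed and memory targets from the quick-start hypothesis (10% faster duration, max RSS within 5%) reduce to two ratio checks. The sketch below is one way an agent might turn measurements into a status before calling empiric result. Everything in it is an assumption: the Measurement shape, the judge function, and the "rejected" and "inconclusive" status strings (only --status accepted appears in the commands above).

```typescript
// Hypothetical sketch. Names and default thresholds are illustrative and
// taken from the quick-start hypothesis, not from Empiric's implementation.

type Verdict = "accepted" | "rejected" | "inconclusive";

interface Measurement {
  durationMs: number; // benchmark wall-clock duration
  maxRssMb: number;   // peak resident set size
}

function judge(
  baseline: Measurement,
  candidate: Measurement,
  minSpeedup = 0.10,      // candidate must be at least 10% faster
  maxMemoryGrowth = 0.05  // and use at most 5% more memory
): Verdict {
  const values = [
    baseline.durationMs, baseline.maxRssMb,
    candidate.durationMs, candidate.maxRssMb,
  ];
  // Without a usable baseline or finite numbers there is nothing to compare.
  if (!values.every(Number.isFinite) || baseline.durationMs <= 0 || baseline.maxRssMb <= 0) {
    return "inconclusive";
  }
  const speedup = (baseline.durationMs - candidate.durationMs) / baseline.durationMs;
  const memoryGrowth = (candidate.maxRssMb - baseline.maxRssMb) / baseline.maxRssMb;
  return speedup >= minSpeedup && memoryGrowth <= maxMemoryGrowth
    ? "accepted"
    : "rejected";
}

console.log(
  judge({ durationMs: 1000, maxRssMb: 500 }, { durationMs: 850, maxRssMb: 510 })
);
```

A single pair of measurements is noisy, so in practice the inputs would be medians over several runs, and a verdict close to either threshold is better recorded as inconclusive.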
Deletion-first workflow gate
Empiric encodes a practical version of the Elon Musk algorithm:
- Question every requirement: every audited component needs a human owner and requirement.
- Delete the part or process: record the deletion proof that would justify keeping it.
- Simplify and optimize: only optimize what survived deletion.
- Accelerate cycle time: make the benchmark loop faster after the component survives.
- Automate last: automation ideas belong in the backlog only after the thing should exist.
Part-linked experiments are blocked until empiric audit gate --part ... passes. Use --skip-gate only for deliberate exploratory work.
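The gate can be pictured as a completeness check over a part's audits. The sketch below is an illustration only: the AuditRecord shape mirrors the --component, --owner, --requirement, and --deletion-proof flags shown earlier, but the field names, the rule that a part with no audits fails, and the functions themselves are assumptions rather than Empiric's actual gate logic.

```typescript
// Hypothetical sketch of a deletion-first gate. Not Empiric's implementation.

interface AuditRecord {
  component: string;
  owner: string;
  requirement: string;
  deletionProof: string;
}

// Return a human-readable list of reasons the gate would fail.
// An empty list means the gate passes.
function gateProblems(audits: AuditRecord[]): string[] {
  // Assumption: a part with no audits has not been questioned yet.
  if (audits.length === 0) return ["no audits recorded for this part"];
  const problems: string[] = [];
  const requiredFields = ["owner", "requirement", "deletionProof"] as const;
  for (const audit of audits) {
    const missing = requiredFields.filter((key) => audit[key].trim() === "");
    if (missing.length > 0) {
      problems.push(`${audit.component}: missing ${missing.join(", ")}`);
    }
  }
  return problems;
}

// skipGate stands in for the --skip-gate escape hatch described above.
function gatePasses(audits: AuditRecord[], skipGate = false): boolean {
  return skipGate || gateProblems(audits).length === 0;
}

const audits: AuditRecord[] = [
  {
    component: "parser cache",
    owner: "Parser owner",
    requirement: "Only keep this if repeated inputs are common.",
    deletionProof: "",
  },
];
console.log(gateProblems(audits));
```

Returning the reasons instead of a bare boolean lets an agent see which audit field to fill in before it retries the gate.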
Empiric does not call AI models or edit application code. It gives agents a durable local structure for deciding what to try next.
Storage
Empiric writes readable local state into the target repository:
- .empiric/config.json
- .empiric/problems/*.json
- .empiric/parts/*.json
- .empiric/hypotheses/*.json
- .empiric/ideas/*.json
- .empiric/audits/*.json
- .empiric/experiments/*.json
- .empiric/runs/*.json
- .empiric/log.md
- EXPERIMENTS.md
- ledger rows for structured benchmark results
Empiric does not modify application code, create git commits, call remote services, or require a specific benchmark framework.
