dynobox
v0.5.0
Cross-harness testing for multi-step agent and skill workflows
Dynobox runs agent scenarios through local harnesses such as Claude Code and Codex, captures observable behavior, and evaluates assertions against what actually happened.
- Site: dynobox.xyz
- Docs: docs.dynobox.xyz
- GitHub: github.com/dynobox/dynobox
Install
npm install -g dynobox
The selected harness executable must already be installed, authenticated, and
available on PATH.
Quick Start
Create a starter dyno file, then run it:
dynobox init
dynobox run
dynobox init writes dynobox/example.dyno.mjs by default. dynobox run with
no argument discovers *.dyno.{mjs,js,ts,mts,yaml,yml} files recursively under
the current directory.
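To make the discovery pattern concrete, here is a minimal sketch of which file names it would select. It is an approximation of the documented glob, not dynobox's actual discovery code; the `isDynoFile` helper and the sample paths other than `dynobox/example.dyno.mjs` and `my-skill.dyno.yaml` are made up for illustration.

```javascript
// Sketch only: approximates the documented pattern
// *.dyno.{mjs,js,ts,mts,yaml,yml}. Not dynobox's real implementation.
const DYNO_FILE = /\.dyno\.(mjs|js|ts|mts|yaml|yml)$/;

// Hypothetical helper: true when the final path segment looks like a dyno file.
function isDynoFile(filePath) {
  const name = filePath.split(/[\\/]/).pop();
  return DYNO_FILE.test(name);
}

const candidates = [
  'dynobox/example.dyno.mjs',
  'my-skill.dyno.yaml',
  'src/index.ts',
  'notes/dyno.md',
];

console.log(candidates.filter(isDynoFile));
// → [ 'dynobox/example.dyno.mjs', 'my-skill.dyno.yaml' ]
```

Whether the real discovery step skips directories such as node_modules is not stated here, so the sketch only covers the file-name match.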
Scope a run to a directory or file:
dynobox run dynobox
dynobox run my-skill.dyno.yaml
Pick a harness at runtime when needed:
dynobox run --harness claude-code
dynobox run --harness codex
dynobox run --harness claude-code,codex
Repeat each selected scenario/harness pair when you want a pass-rate signal:
dynobox run --harness claude-code,codex --iterations 5
What You Can Assert
Dynobox supports assertions for:
- Tool calls with tool.called(...) and tool.notCalled(...).
- Shell command matchers with equals, includes, startsWith, or matches.
- File tool path matchers such as tool.called('read_file', {path: 'package.json'}).
- Ordered tool-call sequences.
- Skill instruction loading.
- Work-directory artifacts.
- Harness transcript and final response text.
- HTTP requests from local child-process tools that honor proxy environment variables.
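The four shell command matcher names suggest the following semantics. This sketch is inferred from the names alone and is not taken from dynobox's source; the `matchers` object and the sample command are hypothetical, and it assumes matches takes a regular expression.

```javascript
// Assumed semantics, inferred from the matcher names only.
// This is not dynobox's API; it just illustrates the four comparisons.
const matchers = {
  equals: (command, expected) => command === expected,
  includes: (command, expected) => command.includes(expected),
  startsWith: (command, expected) => command.startsWith(expected),
  // Assumption: `matches` tests the command against a regular expression.
  matches: (command, pattern) => new RegExp(pattern).test(command),
};

// Hypothetical command captured from a harness run.
const command = 'npm test -- --watch=false';

console.log(matchers.equals(command, 'npm test'));                  // → false
console.log(matchers.includes(command, '--watch=false'));           // → true
console.log(matchers.startsWith(command, 'npm test'));              // → true
console.log(matchers.matches(command, '^npm (test|run test)\\b'));  // → true
```

Under these assumptions, equals is the strictest choice and tends to break when a harness adds flags, while startsWith or matches tolerates trailing arguments.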
Output Modes
- --quiet: compact dots-and-failures output for CI.
- --verbose: expand scenario details even when they pass.
- --debug: include work directory, artifact paths, and debug log paths.
- --reporter json: emit newline-delimited JSON reports.
- --iterations <count>: repeat each selected scenario/harness pair.
- --permission-mode default|dangerous: override harness permission behavior.
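Newline-delimited JSON output can be combined with repeated iterations to compute a pass rate per harness. The sketch below shows one way to do that; the report schema is not described in this README, so the `harness` and `passed` fields, the sample lines, and both helper functions are assumptions for illustration only.

```javascript
// Hypothetical report lines: the real schema of --reporter json is not
// documented here, so the `harness` and `passed` fields are assumptions.
const ndjson = [
  '{"harness":"claude-code","passed":true}',
  '{"harness":"claude-code","passed":false}',
  '{"harness":"codex","passed":true}',
  '{"harness":"codex","passed":true}',
].join('\n');

// Parse one JSON document per non-empty line.
function parseNdjson(text) {
  return text
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line));
}

// Group records by harness and compute passed / total for each.
function passRates(records) {
  const totals = new Map();
  for (const { harness, passed } of records) {
    const entry = totals.get(harness) ?? { passed: 0, total: 0 };
    entry.total += 1;
    if (passed) entry.passed += 1;
    totals.set(harness, entry);
  }
  return Object.fromEntries(
    [...totals].map(([harness, t]) => [harness, t.passed / t.total]),
  );
}

console.log(passRates(parseNdjson(ndjson)));
// → { 'claude-code': 0.5, codex: 1 }
```

Inspect a real report before relying on any field names; only the line-by-line parsing step follows directly from the newline-delimited format.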
Auth
Use dynobox login to paste a dashboard-generated CLI token into local config,
then dynobox whoami to verify the saved identity. dynobox logout removes the
saved token. CLI tokens expire after 24 hours; when a token expires, run
dynobox login again to re-authenticate.
