@accelerate-data/promptfoo-eval-harness
v0.1.3
Promptfoo + OpenCode eval harness for agent behavior. Owns model/tier policy, provider wiring, package discovery, state export, and artifact guards. Consumers own eval YAML, prompts, fixtures, and assertions.
Shared Promptfoo + OpenCode eval harness. Owns model/tier policy, provider wiring, package discovery, Promptfoo state export, and artifact guards. Consumers own eval YAML, prompts, fixtures, and assertions.
Quick Start
Bootstrap a new repo:
npx --package @accelerate-data/promptfoo-eval-harness eval-harness-init
This scaffolds tests/evals/ with package.json, opencode.json,
config/eval-tiers.toml, and a harness-smoke package, installs
dependencies, and adds a Dependabot entry to .github/dependabot.yml
so the repo receives PRs when a new version is released.
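One way to confirm the scaffold landed is a small Node script that checks for the files named above. This is a minimal sketch, not part of the harness: `missingScaffoldFiles` and `EXPECTED_FILES` are hypothetical names, and the check assumes the three files sit directly under tests/evals/ as described. It does not look for the harness-smoke package, because its exact location is not stated here.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Files the bootstrap is described as creating, relative to tests/evals/.
// (Hypothetical constant for this sketch; not exported by the harness.)
const EXPECTED_FILES = ["package.json", "opencode.json", "config/eval-tiers.toml"];

// Return the expected files that do not exist under evalsRoot.
function missingScaffoldFiles(evalsRoot) {
  return EXPECTED_FILES.filter(
    (rel) => !fs.existsSync(path.join(evalsRoot, rel))
  );
}

// Report on the default location, assuming the script runs from the repo root.
const missing = missingScaffoldFiles(path.join("tests", "evals"));
console.log(
  missing.length === 0
    ? "scaffold looks complete"
    : `missing: ${missing.join(", ")}`
);
```

An empty result only means the files exist; npm test and npm run doctor remain the real verification steps.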
Verify the install:
cd tests/evals
npm test # contract tests
npm run doctor # print resolved paths
npm run eval:harness-smoke # one live execution
npm run eval:smoke # smoke across all packages
Dependencies are installed automatically on the first ad-evals run
and re-installed whenever package-lock.json changes.
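Reinstalling when package-lock.json changes is commonly done by comparing a hash of the lockfile against a stamp saved after the last install. The sketch below illustrates that general technique only; how ad-evals actually detects the change is not documented here, and `lockfileHash`, `needsInstall`, `recordInstall`, and the stamp file are all hypothetical.

```javascript
const crypto = require("node:crypto");
const fs = require("node:fs");

// SHA-256 of the lockfile contents, as a hex string.
function lockfileHash(lockPath) {
  return crypto
    .createHash("sha256")
    .update(fs.readFileSync(lockPath))
    .digest("hex");
}

// True when no stamp exists yet (first run) or the lockfile hash
// differs from the hash recorded in the stamp file.
function needsInstall(lockPath, stampPath) {
  if (!fs.existsSync(stampPath)) return true;
  const recorded = fs.readFileSync(stampPath, "utf8").trim();
  return recorded !== lockfileHash(lockPath);
}

// Record the current lockfile hash; call this after a successful install.
function recordInstall(lockPath, stampPath) {
  fs.writeFileSync(stampPath, lockfileHash(lockPath) + "\n");
}
```

A caller would run its install step when `needsInstall` returns true and then call `recordInstall`, so an unchanged lockfile skips the install on later runs.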
Usage
# Run the smoke filter across all packages
npm run eval:smoke
# Run all tests in all packages
npm run eval:regression
# Run one package
node bin/ad-evals.js run packages/my-feature/promptfooconfig.json
# Open the Promptfoo UI
npm run view
# Print resolved state paths
npm run doctor
Documentation
| Doc | What it covers |
| --- | --- |
| Setup Guide | Bootstrap, verify, write a package, run evals, wire CI (give this to a coding agent) |
| Design | Framework architecture and ownership boundary |
What the Framework Owns
- CLI entrypoint (ad-evals)
- Bootstrap (eval-harness-init)
- Path resolution across worktrees
- Promptfoo and OpenCode environment export
- Package discovery rules
- Provider wiring
- Resolved config materialization
- Artifact cleanup guard
- Default tier → agent mapping
What Your Repo Owns
- opencode.json agent definitions (model, steps, permissions)
- Package configs under packages/<name>/
- Prompts, fixtures, vars
- Domain assertions
- Scenario inventory and per-package documentation
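To see which package configs your repo currently holds, a short script can list every packages/<name>/promptfooconfig.json. This is a sketch under assumptions: `findPackageConfigs` is a hypothetical helper, the file name is taken from the usage example, and the harness's own discovery rules may differ.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// List packages/<name>/promptfooconfig.json files under evalsRoot, sorted.
// Assumption: one config per package directory, named as in the usage example.
function findPackageConfigs(evalsRoot) {
  const packagesDir = path.join(evalsRoot, "packages");
  if (!fs.existsSync(packagesDir)) return [];
  return fs
    .readdirSync(packagesDir, { withFileTypes: true })
    .filter((entry) => entry.isDirectory())
    .map((entry) => path.join(packagesDir, entry.name, "promptfooconfig.json"))
    .filter((configPath) => fs.existsSync(configPath))
    .sort();
}

// Print the configs found relative to the current directory (e.g. tests/evals).
for (const configPath of findPackageConfigs(".")) {
  console.log(configPath);
}
```

Directories without a config file are skipped, so the output doubles as a rough inventory of runnable packages.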
Contributing
See CONTRIBUTING.md.
