@jaltez/opencode-autoresearch
v0.1.2
OpenCode Autoresearch loop, benchmark, dashboard, and finalize plugin.
OpenCode Autoresearch
OpenCode Autoresearch is an OpenCode plugin for running benchmark-driven experiment loops. It gives an agent a structured way to create a session, run the canonical benchmark, parse METRIC output, keep or discard changes, preserve experiment memory, and split kept work into review branches.
This project is inspired by pi-autoresearch, but it is implemented as an OpenCode server and TUI plugin. The compatibility target is user-visible behavior and invariants rather than the original internal storage format.
Status
- Runtime: Bun and TypeScript.
- NPM plugin package: @jaltez/opencode-autoresearch.
- Internal subpath exports: @jaltez/opencode-autoresearch/server and @jaltez/opencode-autoresearch/tui.
- Session source of truth: autoresearch.jsonl plus generated autoresearch.state.json.
- Current validation command: bun run check.
Install And Build
bun install
bun run build
bun run check
The package exports built files from dist, so run bun run build before packing or loading the package from a built artifact. The prepack script also runs the build automatically.
Install In OpenCode
Install into the current project:
cd /path/to/project
opencode plugin @jaltez/opencode-autoresearch
Install globally:
opencode plugin -g @jaltez/opencode-autoresearch
OpenCode writes the plugin spec into the matching config scope:
- Project install updates .opencode/opencode.json and .opencode/tui.json.
- Global install updates ~/.config/opencode/opencode.jsonc and ~/.config/opencode/tui.json.
If you are installing a very fresh publish and Bun is enforcing a package age policy, prefix the install command with npm_config_min_release_age=0:
npm_config_min_release_age=0 opencode plugin @jaltez/opencode-autoresearch
npm_config_min_release_age=0 opencode plugin -g @jaltez/opencode-autoresearch
Remove From OpenCode
OpenCode does not currently expose a dedicated plugin uninstall command. Remove the plugin spec from the same config files that were updated during install:
- Project install: remove @jaltez/opencode-autoresearch from .opencode/opencode.json and .opencode/tui.json.
- Global install: remove @jaltez/opencode-autoresearch from ~/.config/opencode/opencode.jsonc and ~/.config/opencode/tui.json.
If you loaded the plugin from local files instead of npm, delete the corresponding file from .opencode/plugins/ or ~/.config/opencode/plugins/.
After removing the config entry or local plugin file, restart OpenCode so it reloads the plugin list.
OpenCode Entry Points
Configure OpenCode to load the published package when you want the server tools plus the dashboard UI:
{
  "plugin": [
    "@jaltez/opencode-autoresearch"
  ]
}
For local development from a checkout, point OpenCode at the package directory instead:
{
  "plugin": [
    "file:."
  ]
}
OpenCode installs npm plugins by package name. Npm subpath entries such as @jaltez/opencode-autoresearch/server and @jaltez/opencode-autoresearch/tui are package exports, but they are not valid values in opencode.json.
The server plugin injects an autoresearch agent and these commands:
- autoresearch for status, pause, resume, backup, restore, export, clear, and mode control.
- autoresearch-create for scaffold creation.
- autoresearch-finalize for review branch planning and creation.
- autoresearch-hooks for hook scaffolding.
The TUI plugin adds a sidebar summary, prompt status, dashboard route, and command palette actions for common control operations.
Session Files
Autoresearch keeps its own state in the target workspace or configured work directory:
- autoresearch.jsonl: append-only session log.
- autoresearch.state.json: regenerated state snapshot.
- autoresearch.md: session objective and rules.
- autoresearch.ideas.md: backlog for deferred hypotheses.
- autoresearch.sh: canonical benchmark entrypoint.
- autoresearch.checks.sh: optional backpressure checks.
- autoresearch.config.json: optional runtime config such as maxIterations.
- autoresearch.hooks/before.sh and autoresearch.hooks/after.sh: optional executable hooks.
- .autoresearch.backups: managed backups for recovery.
When autoresearch.sh exists, run_experiment requires the canonical entrypoint. Harmless wrappers such as env, time, nice, nohup, and bash are accepted; shell chaining and ad hoc benchmark commands are rejected.
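The acceptance rule above can be illustrated with a minimal sketch. This is not the plugin's actual validator: the function name isCanonicalCommand, the chaining pattern, and the way wrapper flags and environment assignments are skipped are all assumptions made for illustration.

```typescript
// Hypothetical sketch of the "canonical entrypoint" rule; the real plugin may differ.
const WRAPPERS = new Set(["env", "time", "nice", "nohup", "bash"]);

// Characters that indicate shell chaining or substitution (assumed list).
const CHAINING = /[;&|`\n]|\$\(/;

function isCanonicalCommand(command: string): boolean {
  // Reject anything that chains or substitutes commands.
  if (CHAINING.test(command)) return false;

  const tokens = command.trim().split(/\s+/).filter(Boolean);
  let i = 0;
  // Skip harmless wrappers, their flags, numeric flag values, and NAME=value assignments.
  while (i < tokens.length) {
    const token = tokens[i];
    const skippable =
      WRAPPERS.has(token) ||
      token.startsWith("-") ||
      /^\d+$/.test(token) ||
      /^[A-Za-z_][A-Za-z0-9_]*=/.test(token);
    if (!skippable) break;
    i++;
  }

  // The first remaining token must be the canonical benchmark script.
  const entry = tokens[i];
  return entry === "autoresearch.sh" || entry === "./autoresearch.sh";
}
```

Under these assumptions, env FOO=1 nice -n 10 bash autoresearch.sh is accepted, while ./autoresearch.sh && rm -rf x and python bench.py are rejected.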
Experiment Loop
- Create or initialize a session with autoresearch-create or init_experiment.
- Run the benchmark with run_experiment.
- Emit metrics from the benchmark using METRIC name=value, optionally followed by a unit or direction.
- Run configured checks or autoresearch.checks.sh.
- Decide with log_experiment: keep, discard, retry, or pending.
- Let auto-resume continue while the session is active and below maxIterations.
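The auto-resume condition in the last step can be sketched as a small predicate. The LoopState shape and the shouldAutoResume name are hypothetical, and treating a missing maxIterations as "no limit" is an assumption rather than documented behavior.

```typescript
// Hypothetical session snapshot; field names are illustrative only.
interface LoopState {
  active: boolean;        // false when the session is paused or finished
  iteration: number;      // number of iterations completed so far
  maxIterations?: number; // optional cap, e.g. from autoresearch.config.json
}

function shouldAutoResume(state: LoopState): boolean {
  if (!state.active) return false;
  // Assumption: no configured cap means the loop may continue.
  if (state.maxIterations === undefined) return true;
  return state.iteration < state.maxIterations;
}
```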
Examples of accepted metric output:
METRIC accuracy=0.91 higher
METRIC latency_ms=120 ms lower
METRIC total_us=15200 us lower
Benchmark timeouts default to 600 seconds and are recorded as crashed with exit code 124. Check timeouts default to 300 seconds and are recorded as checks_failed with exit code 124. Both can be overridden per run with timeout_seconds and checks_timeout_seconds.
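To make the METRIC line format shown above concrete, here is a minimal parsing sketch. It is an illustration under assumptions, not the plugin's parser: the parseMetricLine name, the ParsedMetric shape, and the rule that any trailing token other than higher or lower is treated as a unit are all hypothetical.

```typescript
// Hypothetical result shape; higherIsBetter mirrors the field seen in commit trailers.
interface ParsedMetric {
  name: string;
  value: number;
  unit?: string;
  higherIsBetter?: boolean;
}

function parseMetricLine(line: string): ParsedMetric | null {
  // Expect: METRIC <name>=<value> [unit] [higher|lower]
  const match = /^METRIC\s+([A-Za-z_][\w.-]*)=(\S+)(.*)$/.exec(line.trim());
  if (!match) return null;

  const value = Number(match[2]);
  if (!Number.isFinite(value)) return null;

  const metric: ParsedMetric = { name: match[1], value };
  for (const token of match[3].trim().split(/\s+/).filter(Boolean)) {
    if (token === "higher") metric.higherIsBetter = true;
    else if (token === "lower") metric.higherIsBetter = false;
    else if (metric.unit === undefined) metric.unit = token; // first other token is the unit
  }
  return metric;
}
```

With this sketch, METRIC latency_ms=120 ms lower yields the name latency_ms, the value 120, the unit ms, and higherIsBetter set to false; lines that do not start with METRIC return null.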
Kept runs are committed when the work directory is inside a git repository. Commit messages include JSON trailers:
Autoresearch-Result: {"runId":"...","iteration":1,"status":"kept","decision":"keep"}
Autoresearch-Metrics: [{"name":"accuracy","value":0.91,"higherIsBetter":true}]
If git is unavailable, keep decisions are still recorded without a commit. If git commit fails, the run remains pending so the issue can be fixed and retried.
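Because the trailers carry JSON, they can be read back from a commit message with a few lines of code. The following is a sketch for tooling that consumes these commits; parseAutoresearchTrailers and its return shape are hypothetical and not part of the plugin's API.

```typescript
// Hypothetical container for the two trailer payloads.
interface AutoresearchTrailers {
  result?: unknown;
  metrics?: unknown;
}

function parseAutoresearchTrailers(commitMessage: string): AutoresearchTrailers {
  const trailers: AutoresearchTrailers = {};
  for (const line of commitMessage.split("\n")) {
    const match = /^(Autoresearch-Result|Autoresearch-Metrics):\s*(.+)$/.exec(line.trim());
    if (!match) continue;
    try {
      const payload = JSON.parse(match[2]);
      if (match[1] === "Autoresearch-Result") trailers.result = payload;
      else trailers.metrics = payload;
    } catch {
      // Skip trailers whose payload is not valid JSON.
    }
  }
  return trailers;
}
```

The message text could come from git log -1 --format=%B; the payloads are left as unknown here because the full trailer schema is not specified in this README.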
Hooks
Hooks are optional executable scripts. Non-executable hook files are skipped.
- before runs before the benchmark command.
- after runs after log_experiment applies the decision and git action.
Hooks receive JSON on stdin with the event name, cwd, session snapshot, and run details when available. Hook stdout and stderr are capped at 8KB with UTF-8-safe truncation. JSON stdout can provide a concise message field for agent steering.
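UTF-8-safe truncation means cutting at a byte limit without splitting a multi-byte character. One common way to do this is sketched below; truncateUtf8 is a hypothetical helper, and taking 8KB to mean 8192 bytes is an assumption.

```typescript
// Assumed byte cap for hook output (8KB interpreted as 8 * 1024 bytes).
const HOOK_OUTPUT_LIMIT = 8 * 1024;

function truncateUtf8(text: string, maxBytes: number = HOOK_OUTPUT_LIMIT): string {
  const bytes = new TextEncoder().encode(text);
  if (bytes.length <= maxBytes) return text;

  let end = maxBytes;
  // If the first excluded byte is a continuation byte (10xxxxxx), the cut would
  // split a character, so back up to the start of that character.
  while (end > 0 && (bytes[end] & 0xc0) === 0x80) end--;

  return new TextDecoder().decode(bytes.subarray(0, end));
}
```

For example, truncating "héllo" to 2 bytes returns "h" rather than a broken half of the two-byte "é".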
Finalize
autoresearch-finalize groups kept runs by overlapping changed files. With branch creation enabled, it creates review branches from the parent of the oldest kept commit, applies each group, commits it, verifies no autoresearch artifacts leaked, and checks that the union of created branches matches the final autoresearch branch.
Finalize handles modified, added, renamed, and deleted files. Dirty worktree state is stashed before branch creation and restored afterward.
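Grouping runs by overlapping changed files can be viewed as finding connected components: two runs belong together if they touch a common file, directly or through a chain of other runs. The sketch below uses a union-find structure to show the idea. The KeptRun shape and the groupByOverlappingFiles name are hypothetical, and whether the plugin merges groups transitively in this way is an assumption.

```typescript
// Hypothetical description of a kept run and the files it changed.
interface KeptRun {
  runId: string;
  files: string[];
}

function groupByOverlappingFiles(runs: KeptRun[]): KeptRun[][] {
  // parent[i] points toward the representative of run i's group.
  const parent = runs.map((_, i) => i);
  const find = (i: number): number => {
    while (parent[i] !== i) {
      parent[i] = parent[parent[i]]; // path halving keeps lookups short
      i = parent[i];
    }
    return i;
  };

  // Remember the first run that touched each file and merge later runs into it.
  const firstRunForFile = new Map<string, number>();
  runs.forEach((run, i) => {
    for (const file of run.files) {
      const seen = firstRunForFile.get(file);
      if (seen === undefined) firstRunForFile.set(file, i);
      else parent[find(i)] = find(seen);
    }
  });

  // Collect runs under their group representative, preserving input order.
  const groups = new Map<number, KeptRun[]>();
  runs.forEach((run, i) => {
    const root = find(i);
    const group = groups.get(root) ?? [];
    group.push(run);
    groups.set(root, group);
  });
  return Array.from(groups.values());
}
```

In this sketch, a run that touches x.ts and y.ts pulls earlier runs that touched only x.ts or only y.ts into the same group, so each resulting group could map to one review branch.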
Durability And Recovery
Autoresearch blocks loop mutations when the source-of-truth JSONL is missing or invalid. Use:
- autoresearch status to inspect durability warnings.
- autoresearch backup to preserve current artifacts.
- autoresearch backups to list saved backups.
- autoresearch restore to restore the latest or requested backup.
- autoresearch export to write autoresearch.dashboard.html, including recovery warnings.
Backups are kept under .autoresearch.backups and are preserved by destructive clears.
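As a rough illustration of what an "invalid" JSONL log could mean, the sketch below checks that every non-empty line parses as a JSON object. This is an assumption about the kind of check involved, not the plugin's actual validation; checkJsonl and its result shape are hypothetical, and the real record schema is not described here.

```typescript
// Hypothetical validation result.
interface JsonlCheck {
  valid: boolean;
  entries: number;    // lines that parsed as JSON objects
  badLines: number[]; // 1-based line numbers that failed
}

function checkJsonl(text: string): JsonlCheck {
  const badLines: number[] = [];
  let entries = 0;
  text.split("\n").forEach((line, index) => {
    if (line.trim() === "") return; // tolerate blank lines, e.g. a trailing newline
    try {
      const parsed = JSON.parse(line);
      // Assumption: each log record is a JSON object, not a bare value or array.
      if (parsed !== null && typeof parsed === "object" && !Array.isArray(parsed)) entries++;
      else badLines.push(index + 1);
    } catch {
      badLines.push(index + 1);
    }
  });
  return { valid: badLines.length === 0, entries, badLines };
}
```

A caller could read autoresearch.jsonl, pass its contents to this function, and refuse to mutate the session when valid is false.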
Development
bun run typecheck
bun test
bun run check
Keep changes focused and add tests around any behavior that affects session durability, git operations, hook execution, metrics parsing, or auto-resume semantics.
License
MIT License. See LICENSE.
