helixevo
v0.10.0
HelixEvo
Co-evolving skill and project brain for AI agents. HelixEvo captures failures, traces activations, models pressure, routes governed responses, and promotes cross-project transfer. It reviews structural topology changes and safely executes accepted topology transitions with rollback. Approved ontology concepts become active semantic consumers inside the live control loop, and Proof becomes a bounded steering input for future control. Automatic theory-conformance verification runs through contract-backed scenarios plus bounded live smoke checks.
How it works
HelixEvo builds on ideas from EvoSkill and AutoResearch to create a three-directional evolution system:
- Generalize ↑ — Detect cross-project patterns and promote them to abstract skills
- Specialize ↓ — Create project-specific skills from domain skills + project failures
- Lateral ↔ — Merge, split, and resolve conflicts between skills
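The Generalize direction can be pictured as counting how many distinct projects share a failure pattern. The sketch below is illustrative only: the `Failure` shape, the `pattern` label, and the `minProjects` threshold are assumptions, not HelixEvo's actual schema or promotion rule.

```typescript
// Hypothetical shape of a captured failure; the real schema is not shown in this README.
interface Failure {
  project: string;
  pattern: string; // assumed normalized label for the failure's root cause
}

// Return the patterns observed in at least `minProjects` distinct projects.
function crossProjectPatterns(failures: Failure[], minProjects = 2): string[] {
  const projectsByPattern = new Map<string, Set<string>>();
  for (const f of failures) {
    let projects = projectsByPattern.get(f.pattern);
    if (!projects) {
      projects = new Set<string>();
      projectsByPattern.set(f.pattern, projects);
    }
    projects.add(f.project);
  }
  const candidates: string[] = [];
  projectsByPattern.forEach((projects, pattern) => {
    if (projects.size >= minProjects) candidates.push(pattern);
  });
  return candidates.sort();
}
```

Repeated failures inside a single project do not count toward the threshold here, so a pattern only becomes a generalization candidate once it recurs across projects.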
Every proposed change goes through:
- 3 independent LLM judges (Task Completion, Correction Alignment, Side-Effect Check)
- Regression testing against skill tests
- 3-day canary deployment with auto-rollback
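As a rough mental model, the three gates above can be read as one decision function. This is a sketch under stated assumptions: the README does not say whether the judges must be unanimous, how regression results are thresholded, or what triggers auto-rollback, so unanimity, zero tolerated regression failures, and the `canaryRegressed` flag are all hypothetical.

```typescript
type JudgeName = "Task Completion" | "Correction Alignment" | "Side-Effect Check";

interface JudgeVerdict {
  judge: JudgeName;
  approved: boolean;
}

interface GateInput {
  verdicts: JudgeVerdict[];
  regressionFailures: number; // failing skill tests after the proposed change
  canaryDaysElapsed: number;
  canaryRegressed: boolean; // assumed signal that triggers auto-rollback
}

type GateResult = "rejected" | "canary" | "rolled-back" | "promoted";

const CANARY_DAYS = 3; // canary length stated in the README
const REQUIRED_JUDGES: JudgeName[] = [
  "Task Completion",
  "Correction Alignment",
  "Side-Effect Check",
];

function evaluateProposal(input: GateInput): GateResult {
  // Assumption: every judge must approve (unanimity).
  const allApproved = REQUIRED_JUDGES.every((name) =>
    input.verdicts.some((v) => v.judge === name && v.approved)
  );
  // Assumption: a single regression failure blocks the proposal.
  if (!allApproved || input.regressionFailures > 0) return "rejected";
  if (input.canaryRegressed) return "rolled-back";
  return input.canaryDaysElapsed >= CANARY_DAYS ? "promoted" : "canary";
}
```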
Prerequisites
- Node.js 18+
- Bun: used for building (`curl -fsSL https://bun.sh/install | bash`)
- Claude CLI: installed and authenticated
  - Requires a Claude Max plan subscription
  - Claude Code remains the default provider for HelixEvo
  - Prefer `claude auth login` managed credentials over exporting a hardcoded `CLAUDE_CODE_OAUTH_TOKEN`
  - HelixEvo retries once without an inherited `CLAUDE_CODE_OAUTH_TOKEN` if that override is stale but local Claude auth is valid
- Optional providers
  - Codex CLI (`codex`) for GPT Codex on shared prompt-in / text-out paths
  - Ollama (`ollama` + local daemon) for shared local-model prompt-in / text-out paths
  - Claude-only web-search and research tooling remain explicitly Claude-scoped
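The retry behavior around a stale `CLAUDE_CODE_OAUTH_TOKEN` can be sketched as a retry-once wrapper. The code below is not HelixEvo's implementation: `runWithTokenFallback`, the `RunResult` shape, and the injected `run` callback are hypothetical names, and the sketch does not itself check that local Claude auth is valid.

```typescript
type Env = Record<string, string | undefined>;
type RunResult =
  | { ok: true; output: string }
  | { ok: false; authError: boolean };

const TOKEN_VAR = "CLAUDE_CODE_OAUTH_TOKEN";

// `run` stands in for whatever invokes the provider with a given environment.
function runWithTokenFallback(run: (env: Env) => RunResult, env: Env): RunResult {
  const first = run(env);
  // Only retry when the failure looks auth-related and an override was inherited.
  if (first.ok || !first.authError || env[TOKEN_VAR] === undefined) return first;
  // Retry exactly once with the inherited override removed; the caller's env is not mutated.
  const withoutToken: Env = { ...env };
  delete withoutToken[TOKEN_VAR];
  return run(withoutToken);
}
```

Injecting `run` keeps the sketch independent of any particular process-spawning API and makes the single-retry behavior easy to check in isolation.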
Verify prerequisites:
node --version # v18+
bun --version # any
claude --version # default provider
codex --version # optional
ollama --version # optional
Install
From npm (recommended)
npm install -g helixevo
From GitHub
npm install -g github:danielchen26/helixevo
From source
git clone https://github.com/danielchen26/helixevo.git
cd helixevo
npm install
npm run build
npm link
Quick Start
# 1. Initialize — imports existing skills + generates skill tests
helixevo init
# 2. Capture failures from a session
helixevo capture path/to/session.json --project myapp
# 3. Evolve skills from failures
helixevo evolve --verbose
# 4. View the skill network
helixevo graph
# 5. Open the web dashboard
helixevo dashboard
Commands
| Command | Description |
|---------|-------------|
| helixevo watch | Always-on learning: auto-capture + auto-evolve |
| helixevo metrics | Correction rates, skill trends, evolution impact |
| helixevo proof | Outcome attribution, proof review, and steering summaries across interventions, transfer, topology, ontology, and evolution |
| helixevo verify-brain | Automatic theory-conformance runner across deterministic scenarios plus bounded live smoke checks |
| helixevo health | Network health: cohesion, coverage, balance, transfer |
| helixevo init | Import existing skills + generate skill tests |
| helixevo capture <session> | Extract failures from a session file |
| helixevo project-setup <path> | Analyze a project, match skills, and surface capability gaps |
| helixevo evolve | Evolve skills from captured failures |
| helixevo generalize | Promote cross-project patterns ↑ |
| helixevo specialize --project <name> | Create project-specific skills ↓ |
| helixevo graph | View skill network in terminal |
| helixevo ontology | Refresh, review, adopt, and inspect ontology concepts plus semantic control coverage |
| helixevo topology | Prepare, apply, roll back, and inspect reviewed topology execution |
| helixevo research | Proactive web research for skill improvement (Claude-scoped web-tool path) |
| helixevo dashboard [--port <n>] | Open web dashboard, preferring localhost:3847 and falling forward if occupied |
| helixevo status | Show system health plus provider-control truth |
| helixevo report | Generate evolution report |
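The `helixevo dashboard` row says the command prefers one port and falls forward when it is occupied. One plausible reading is a scan over successive ports, sketched below. Both the sequential scan and the `maxAttempts` cap are assumptions; the README does not specify how the next port is chosen. The `isFree` predicate is injected so the sketch stays independent of any networking API.

```typescript
// Return the first free port at or above `preferred`, or undefined if none is found.
function pickPort(
  preferred: number,
  isFree: (port: number) => boolean,
  maxAttempts = 10 // hypothetical cap on how far to scan
): number | undefined {
  for (let offset = 0; offset < maxAttempts; offset++) {
    const port = preferred + offset;
    if (port > 65535) break; // highest valid TCP port
    if (isFree(port)) return port;
  }
  return undefined;
}
```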
Common options
Most commands support:
- `--dry-run`: Preview changes without applying
- `--verbose`: Show detailed LLM interactions
Graph options
helixevo graph # TUI view (instant, cached)
helixevo graph --mermaid # Open in browser as Mermaid diagram
helixevo graph --obsidian ~/vault # Sync to Obsidian vault
helixevo graph --rebuild # Re-infer relationships (LLM call)
helixevo graph --optimize # Refresh topology review queue first, then report full vs partial conflict enrichment
helixevo ontology --status # Show ontology kernel / frontier / extension / adoption state
helixevo ontology --status --verbose
# Show top active concepts, unused extensions, and deprecation-sensitive concepts
helixevo ontology --refresh # Derive frontier concepts from recurring evidence
helixevo ontology --review <id> --decision promote
# Promote a reviewed frontier concept into approved extensions
helixevo topology --status # Show reviewed topology execution state
helixevo topology --prepare <id> # Prepare an accepted topology candidate
helixevo topology --apply <id> # Apply a safe prepared topology plan
helixevo topology --rollback <id> # Roll back an applied topology plan
helixevo proof --status # Review proof state across the live loop
helixevo proof --review <id> --decision verify
# Verify a proof record after operator review
helixevo verify-brain --verbose # Run the contract-backed brain verification workflow
helixevo verify-brain --release # Run stricter release-grade conformance handling
Research options
helixevo research --verbose # Full output
helixevo research --project ./myapp # Focus research on a project
helixevo research --max-hypotheses 5 # Test more hypotheses
helixevo research --dry-run # Preview without creating skills
Data
All data is stored in ~/.helix/:
~/.helix/
├── config.json # Configuration
├── failures.jsonl # Captured failures
├── activation-traces.jsonl # Native + derived activation traces
├── pressure-signals.jsonl # Native + derived adaptation pressure
├── pressure-interventions.jsonl # Routed intervention ledger across response lanes
├── transfer-events.jsonl # Promotion / transfer evidence across motifs and projects
├── governance-state.json # Operator steering for active governance mode
├── llm-runtime-state.json # Default provider, per-provider health, last execution, and fallback truth
├── topology-review-candidates.json # Persisted structural review queue
├── topology-review-decisions.jsonl # Operator accept/reject/defer decision ledger
├── topology-optimize-status.json # Last full/partial optimize refresh status + queue/enrichment summary
├── topology-overrides.json # Applied safe structural topology overrides
├── topology-snapshots.json # Snapshot refs for reviewed execution and rollback
├── topology-apply-plans.json # Prepared reviewed topology plans
├── topology-executions.jsonl # Prepared/applied/rolled-back execution ledger
├── topology-artifacts.jsonl # Evidence artifacts for reviewed structural execution
├── proof-reviews.jsonl # Operator verify/defer/contest ledger for derived proof records
├── evolution-artifacts.jsonl # Evolution + ontology-review evidence artifacts
├── theory-conformance/
│ ├── latest.json # Latest contract-backed brain verification result
│ ├── reports/ # Human-readable theory-conformance reports
│ └── runs/ # Per-run scenario artifacts and structured outputs
├── ontology/
│ ├── kernel.json # Materialized ontology kernel snapshot
│ ├── extensions.json # Approved ontology extensions
│ ├── frontier.json # Provisional frontier concepts awaiting review
│ ├── reviews.jsonl # Ontology review decisions
│ └── change-log.jsonl # Native ontology change events
├── frontier.json # Pareto frontier (top-k configurations)
├── evolution-history.json # All evolution runs + proposals
├── skill-tests.jsonl # Regression test cases
├── skill-graph.json # Cached network (nodes + edges + ontology version)
├── canary-registry.json # Active canary deployments
├── knowledge-buffer.json # Research discoveries + drafts
├── general/ # Skills (SKILL.md files)
│ ├── my-skill/SKILL.md
│ └── ...
├── backups/ # Pre-canary skill backups
└── reports/ # Generated reports
Web Dashboard
The dashboard provides an interactive view of your skill ecosystem:
helixevo dashboard
# Prefers http://localhost:3847 and falls forward if that port is occupied
helixevo dashboard --port 3900
# Prefer port 3900 first
Tabs:
- Overview — Premium control cockpit with frontier signals, brain foundation, provider-control truth, semantic backbone, ontology adoption visibility, proof review visibility, pressure counts, topology review visibility, and prepared/applied structural state
- Skill Network — Interactive graph, premium inspector, co-evolution routing signals, and topology review/execution handoff links
- Co-Evolution — Operator cockpit for routed pressure response, governance mode visibility, promotion queues, transfer evidence, semantic route influence, proof-aware route rationale, and topology handoff
- Ontology — Semantic control surface for kernel visibility, frontier concept review, approved ontology extensions, adoption coverage, deprecation risk, and native ontology change events
- Topology — Governance steering plus a persistent operator pipeline for review → prepare → apply → rollback across merge / split / promote / rewire / consolidate candidates
- Proof — Outcome-attribution, review, and proof-steering cockpit for bounded effectiveness across interventions, transfer, topology execution, semantic adoption, and evolution impact
- Projects — Project intake studio, live project analysis, gap routing, per-project pressure hotspots, and promotion feeders
- Evolution — Timeline of evolution runs with judge scores, artifact provenance, and activation-aware context
- Research — Knowledge buffer plus a live “why research now” handoff from current pressure, governed routing, and recurring gaps
- Frontier — Pareto frontier with 4-dimension scores + canary status
The dashboard requires Next.js dependencies. On first run:
cd dashboard && npm install
Craft Agent Integration
HelixEvo includes a Craft Agent skill at integrations/craft-agent/:
# Copy to your skills directory
cp -r integrations/craft-agent/skills/skill-evolver ~/.agents/skills/
Then use [skill:skill-evolver] in Craft Agent to trigger evolution.
Architecture
Failures → Cluster → Propose → Replay → Multi-Judge → Regression → Canary → Frontier
│ │
│ 3 independent judges:
│ - Task Completion
│ - Correction Alignment
│ - Side-Effect Check
│
Knowledge Buffer
(discoveries + drafts from rejected proposals)
Brain foundation:
- Ontology defines the stable semantic kernel for skills, projects, tasks, capabilities, artifacts, and mutations.
- Ontology frontier and extensions let new semantic concepts emerge as provisional hypotheses, pass explicit review, become approved extensions, and then appear as active semantic consumers in pressure, routing, transfer, and structural interpretation without free-form drift.
- Semantic adoption visibility shows which approved concepts are unused, active, deprecation-sensitive, or currently influencing live route rationale.
- Activation traces record which skills and gaps were active during capture and project analysis.
- Pressure signals turn failures and project gaps into explicit adaptation demand.
- Pressure interventions record how HelixEvo responded across research, specialize, evolve, generalize, and manual-review lanes.
- Governed routing and transfer evidence let recurring multi-project motifs bias toward promotion and show when reusable knowledge was actually realized.
- Governance steering lets the operator pin or release the active adaptation mode rather than relying only on derived routing.
- Topology review persists merge / split / promote / rewire / consolidate candidates so manual review is a real workflow.
- Reviewed topology execution turns accepted safe candidates into prepared plans, snapshot-backed applies, and rollbackable structural transitions.
- Proof control turns bounded outcome attribution into an explicit operator layer where interventions, transfer, topology execution, semantic adoption, and evolution impact can be verified, deferred, or contested, then fed back into future control through bounded proof steering.
- Evolution artifacts preserve proposal-level evidence so the dashboard can show what changed, why, and with what provenance.
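The reviewed topology execution described above (accepted candidate, prepared plan, applied transition, rollback) maps naturally onto a small state machine. The sketch below uses state names taken from this README's wording, but the transition table is an assumption: for example, whether a rolled-back plan can be prepared again is not stated, so it is treated as terminal here.

```typescript
type TopologyState = "accepted" | "prepared" | "applied" | "rolled-back";
type TopologyAction = "prepare" | "apply" | "rollback";

// Allowed transitions; anything missing from the table is rejected.
const TRANSITIONS: Record<TopologyState, Partial<Record<TopologyAction, TopologyState>>> = {
  accepted: { prepare: "prepared" },
  prepared: { apply: "applied" },
  applied: { rollback: "rolled-back" },
  "rolled-back": {}, // assumed terminal
};

function step(state: TopologyState, action: TopologyAction): TopologyState {
  const next = TRANSITIONS[state][action];
  if (next === undefined) {
    throw new Error(`cannot ${action} a candidate that is ${state}`);
  }
  return next;
}
```

Keeping the transitions in a table means an out-of-order request, such as applying a candidate that was never prepared, fails loudly instead of silently changing structure.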
Three-layer hierarchy:
- System — Global agent behaviors
- Domain — Cross-project patterns (generalized skills)
- Project — Project-specific specializations
License
MIT
