shll-agent-runner
v1.0.1
SHLL Agent Runtime - Automated execution service for Agent NFAs
SHLL Runner
Deterministic agent runner for SHLL autonomous execution.
What It Does
- Runs agent cycles with a fixed runtime pipeline:
  observe -> propose -> plan -> validate -> simulate -> execute -> verify -> record
- Enforces hard/soft guardrails in code (not only in the prompt).
- Rejects low-confidence non-wait actions with a runtime confidence gate.
- Writes structured run records (failureCategory, errorCode, executionTrace).
- Supports shadow mode for canary comparison without affecting on-chain state.
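The fixed stage order can be pictured as a loop over an ordered list of stage handlers. The sketch below is illustrative only: `StageHandler`, `runCycle`, and the context shape are hypothetical names, not the runner's actual API, and stopping at the first failing stage is an assumption (a real runner would likely still write a run record on failure).

```typescript
// Hypothetical sketch of a fixed-order pipeline; not the runner's real types.
const STAGES = [
  "observe", "propose", "plan", "validate",
  "simulate", "execute", "verify", "record",
] as const;

type Stage = (typeof STAGES)[number];

interface StageResult {
  ok: boolean;
  reason?: string;
}

// Each handler receives a shared context object for the current cycle.
type StageHandler = (ctx: Record<string, unknown>) => StageResult;

interface CycleOutcome {
  completed: Stage[];
  failedAt?: Stage;
  reason?: string;
}

function runCycle(
  handlers: Record<Stage, StageHandler>,
  ctx: Record<string, unknown> = {},
): CycleOutcome {
  const completed: Stage[] = [];
  for (const stage of STAGES) {
    const result = handlers[stage](ctx);
    if (!result.ok) {
      // Assumption: stop at the first failing stage; later stages never run.
      return { completed, failedAt: stage, reason: result.reason };
    }
    completed.push(stage);
  }
  return { completed };
}
```

Because the order lives in one constant, a stage cannot be skipped or reordered by model output; only the handlers vary.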
Target Architecture
Runner now exposes the main target-architecture building blocks instead of relying on prompt-only recovery and text inference:
- TurnExecutionContext - tracks active model, tool mode, fallback attempts, and quality-recovery attempts for one turn
- TurnOrchestrator - owns initial generation, fallback-model retry, low-quality retry, and quality-gate recovery
- DecisionEvidence - attaches structured tool/evidence summaries to decisions so quality checks do not depend only on regex over free text
- waitKind / outcomeKind - makes wait classification explicit (quality_recovery, cadence, infra_retry, etc.) so runtime/scheduler can stop guessing from the reasoning text
- persisted decision metadata - decision logs now retain wait kind, outcome kind, evidence, and turn context snapshots for replay/debugging
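To illustrate why an explicit wait kind helps, the sketch below switches on a typed field instead of pattern-matching free text. Everything here is hypothetical: the `DecisionLike` shape, the `nextDelayMs` function, and the delay values are invented for illustration, and only the three wait kinds named above are listed (the real set is larger).

```typescript
// Only the wait kinds named in this README; the actual list continues ("etc.").
type WaitKind = "quality_recovery" | "cadence" | "infra_retry";

// Hypothetical decision shape, not the runner's real type.
interface DecisionLike {
  action: string;
  reasoning: string;
  waitKind?: WaitKind;
}

// Hypothetical scheduling policy: the delay values are made up for illustration.
function nextDelayMs(decision: DecisionLike, pollIntervalMs: number): number {
  if (decision.action !== "wait") return pollIntervalMs;
  switch (decision.waitKind) {
    case "infra_retry":
      // Transient infrastructure problem: retry sooner than the normal cadence.
      return Math.min(pollIntervalMs, 5000);
    case "cadence":
      // Deliberate pacing: back off for longer.
      return pollIntervalMs * 2;
    case "quality_recovery":
    default:
      return pollIntervalMs;
  }
}
```

The point of the design is that the scheduler never has to inspect `reasoning`; the classification travels with the decision as data.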
Quick Start
- Install dependencies:
  npm install
- Configure environment:
  - copy .env.example to .env
  - fill required vars (RPC_URL, OPERATOR_PK, AGENT_NFA_ADDRESS, DB config, etc.)
- Run:
  npm run dev
- Verify build/tests:
  npm run test:replay
Production Minimum Env
Minimum required variables for production runtime:
- RPC_URL
- CHAIN_ID
- OPERATOR_PK
- AGENT_NFA_ADDRESS
- DATABASE_URL (or PGHOST/PGPORT/PGUSER/PGPASSWORD/PGDATABASE)
- API_KEY (recommended non-empty)
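A startup check for these variables might look like the sketch below. `checkProductionEnv` is a hypothetical helper, not part of the runner, and it assumes that all five PG* variables are needed when DATABASE_URL is absent.

```typescript
type Env = Record<string, string | undefined>;

const REQUIRED = ["RPC_URL", "CHAIN_ID", "OPERATOR_PK", "AGENT_NFA_ADDRESS"];
const PG_PARTS = ["PGHOST", "PGPORT", "PGUSER", "PGPASSWORD", "PGDATABASE"];

function isSet(env: Env, key: string): boolean {
  const value = env[key];
  return value !== undefined && value.trim() !== "";
}

// Hypothetical helper: reports missing required vars and soft warnings.
function checkProductionEnv(env: Env): { missing: string[]; warnings: string[] } {
  const missing = REQUIRED.filter((key) => !isSet(env, key));

  // DATABASE_URL may be replaced by the discrete PG* variables.
  // Assumption: all five PG* variables are required in that case.
  if (!isSet(env, "DATABASE_URL")) {
    const missingPg = PG_PARTS.filter((key) => !isSet(env, key));
    if (missingPg.length > 0) {
      missing.push(`DATABASE_URL (or ${missingPg.join(", ")})`);
    }
  }

  // API_KEY is recommended rather than required, so it only produces a warning.
  const warnings = isSet(env, "API_KEY") ? [] : ["API_KEY is empty"];
  return { missing, warnings };
}
```

In a Node.js process this would be called as `checkProductionEnv(process.env)` before the first agent cycle, failing fast if `missing` is non-empty.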
Recommended baseline values:
- LOG_LEVEL=info
- POLL_INTERVAL_MS=30000
- TOKEN_LOCK_LEASE_MS=90000
- MAX_RETRIES=3
- PG_POOL_MAX=10
- MAX_RUN_RECORDS=1000
- STATUS_RUNS_LIMIT=20
- LLM_MIN_ACTION_CONFIDENCE=0.45
Confidence gate behavior:
- Applied only when decision.action !== "wait".
- If confidence < LLM_MIN_ACTION_CONFIDENCE, runner blocks the action with:
  - failureCategory=model_output_error
  - errorCode=MODEL_LOW_CONFIDENCE
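The gate rule can be expressed in a few lines. This is a minimal sketch of the behavior described above, not the runner's code: `Decision`, `GateResult`, and `confidenceGate` are hypothetical names, while the two field values come from this README.

```typescript
// Hypothetical decision shape; the real one carries more fields.
interface Decision {
  action: string;
  confidence: number;
}

type GateResult =
  | { allowed: true }
  | {
      allowed: false;
      failureCategory: "model_output_error";
      errorCode: "MODEL_LOW_CONFIDENCE";
    };

function confidenceGate(decision: Decision, minConfidence: number): GateResult {
  // Wait decisions are never gated.
  if (decision.action === "wait") return { allowed: true };

  // Strictly below the threshold is blocked; equal to the threshold passes.
  if (decision.confidence < minConfidence) {
    return {
      allowed: false,
      failureCategory: "model_output_error",
      errorCode: "MODEL_LOW_CONFIDENCE",
    };
  }
  return { allowed: true };
}
```

With the recommended LLM_MIN_ACTION_CONFIDENCE=0.45, a non-wait action at confidence 0.30 is blocked, one at exactly 0.45 passes, and a wait at any confidence passes.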
Shadow Mode (Phase 4)
Use shadow mode to compare current planner behavior vs legacy planner behavior.
Environment Flags
- SHADOW_MODE_ENABLED=true|false
- SHADOW_MODE_TOKEN_IDS=1,2,3 (optional; empty means all schedulable tokens)
- SHADOW_EXECUTE_TX=false|true
  - false: dry-run only, no on-chain transaction submission
  - true: allow on-chain submit in shadow mode (use with caution)
Stored Fields
Each run now includes:
- run_mode: primary or shadow
- shadow_compare: structured comparison between primary and legacy plans
Metrics API
- GET /shadow/metrics
- Optional query:
  - tokenId=<uint>
  - sinceHours=<1..720>
Response contains per-mode metrics:
- total/success/rejected/exception/intervention counts
- success/reject/exception/intervention/divergence rates
- average end-to-end latency derived from execution trace
- shadow-vs-primary rate deltas
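A client could build the metrics request URL as sketched below. `buildShadowMetricsUrl` is a hypothetical helper and the base URL in the usage note is a placeholder; the response schema and any authentication requirement are not specified here, so the sketch covers only the path and the two documented query parameters.

```typescript
interface ShadowMetricsQuery {
  tokenId?: number;    // unsigned integer
  sinceHours?: number; // 1..720
}

// Hypothetical helper: validates the documented ranges and builds the URL.
function buildShadowMetricsUrl(
  baseUrl: string,
  query: ShadowMetricsQuery = {},
): string {
  const params: string[] = [];

  if (query.tokenId !== undefined) {
    if (!Number.isSafeInteger(query.tokenId) || query.tokenId < 0) {
      throw new RangeError("tokenId must be a non-negative integer");
    }
    params.push(`tokenId=${query.tokenId}`);
  }

  if (query.sinceHours !== undefined) {
    const h = query.sinceHours;
    if (!Number.isInteger(h) || h < 1 || h > 720) {
      throw new RangeError("sinceHours must be an integer in 1..720");
    }
    params.push(`sinceHours=${h}`);
  }

  // Strip trailing slashes so the path is not doubled.
  const base = baseUrl.replace(/\/+$/, "");
  const suffix = params.length > 0 ? `?${params.join("&")}` : "";
  return `${base}/shadow/metrics${suffix}`;
}
```

For example, `buildShadowMetricsUrl("http://runner.example", { tokenId: 7, sinceHours: 24 })` yields `http://runner.example/shadow/metrics?tokenId=7&sinceHours=24`.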
Core API Endpoints
- GET /health
- GET /status
- GET /status/all
- GET /agent/dashboard
- GET /agent/activity
- GET /shadow/metrics
- POST /enable
- POST /disable
- POST /strategy/upsert
- POST /strategy/clear-goal
Validation and Replay
- npm run test:replay
  - compiles TypeScript
  - runs replay classifier tests
  - runs params validator tests
  - runs run failure classifier tests
  - runs planner tests
Compatibility
This runner-side refactor keeps chain interface behavior intact and remains compatible with BAP-578-based agent assets/contracts.
External Skills (No-Code Readonly Extension)
Runner supports dynamic readonly tool registration from an external skill catalog. This allows adding new data/analysis skills without changing runner source code.
Environment:
- RUNNER_EXTERNAL_SKILLS_DB_ENABLED - default true; load catalog from DB first
- RUNNER_EXTERNAL_SKILLS_PATH - absolute/relative path to a JSON catalog file (DB fallback #1)
- RUNNER_EXTERNAL_SKILLS_JSON - inline JSON catalog string (DB fallback #2)
- RUNNER_EXTERNAL_SKILL_AUTO_BIND - default true; auto-attach loaded external skills to selected agent types
- RUNNER_EXTERNAL_SKILL_BIND_AGENT_TYPES - comma list, default meme_trader,llm_trader,llm_defi, supports *
- RUNNER_LLM_TOOL_RESULT_MAX_CHARS - max JSON chars returned per tool call to LLM (default 6000) to prevent context overflow
Catalog source priority:
- DB table external_skill_catalogs (when enabled and non-empty)
- RUNNER_EXTERNAL_SKILLS_PATH
- RUNNER_EXTERNAL_SKILLS_JSON
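The priority order above amounts to a first-match selection. The sketch below assumes the first available source wins outright (no merging across sources); `pickCatalogSource` and `SourceInputs` are hypothetical names used only for illustration.

```typescript
type CatalogSource = "db" | "path" | "json" | "none";

interface SourceInputs {
  dbEnabled: boolean;  // RUNNER_EXTERNAL_SKILLS_DB_ENABLED
  dbRowCount: number;  // rows found in external_skill_catalogs
  path?: string;       // RUNNER_EXTERNAL_SKILLS_PATH
  inlineJson?: string; // RUNNER_EXTERNAL_SKILLS_JSON
}

function hasText(value: string | undefined): boolean {
  return value !== undefined && value.trim() !== "";
}

// Assumption: the first available source is used on its own, without merging.
function pickCatalogSource(inputs: SourceInputs): CatalogSource {
  if (inputs.dbEnabled && inputs.dbRowCount > 0) return "db";
  if (hasText(inputs.path)) return "path";
  if (hasText(inputs.inlineJson)) return "json";
  return "none";
}
```

Note that an enabled but empty DB table falls through to the file path, which matches the "when enabled and non-empty" condition.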
Admin API (requires /v3/admin/* auth):
- GET /v3/admin/external-skills - list DB catalog
- GET /v3/admin/external-skills/:name - get one DB catalog row
- PUT /v3/admin/external-skills/:name - upsert one DB catalog row
- DELETE /v3/admin/external-skills/:name - delete one DB catalog row
- POST /v3/admin/external-skills/reload - reload catalog to runtime (no restart)
Catalog item shape:
[
  {
    "name": "binance_token_dynamic_info",
    "description": "Get token dynamic market data from Binance Web3 endpoint",
    "endpoint": "https://web3.binance.com/bapi/defi/v4/public/wallet-direct/buw/wallet/market/token/dynamic/info",
    "method": "GET",
    "headers": {
      "Accept-Encoding": "identity"
    },
    "parameters": {
      "type": "object",
      "properties": {
        "chainId": { "type": "string", "description": "56 or CT_501" },
        "contractAddress": { "type": "string", "description": "token contract address" }
      },
      "required": ["chainId", "contractAddress"]
    },
    "requestPlacement": "query",
    "extractPath": "data"
  }
]
Notes:
- External skills are read-only (encoded as a no-op payload), so write-path safety remains unchanged.
- Header values support env placeholders: ${ENV:MY_API_KEY}.
- For GET, params are sent as a query string; for POST, params are sent as a JSON body.
- External skill names cannot override existing built-in actions; duplicate names are skipped at bootstrap.
- This collision check also applies among external skills loaded in one startup batch.
- Catalog validation is strict for method (GET/POST) and requestPlacement (query/json).
- When auto-bind is enabled, newly loaded external skills become available to configured agent types without manual blueprint edits.
- DB schema:
  - table: external_skill_catalogs
  - required columns: skill_name, description, endpoint, parameters
  - optional columns: method, timeout_ms, headers, request_placement, extract_path, enabled
- SQL example (upsert one skill):
  INSERT INTO external_skill_catalogs (
    skill_name, description, endpoint, method, parameters, request_placement, extract_path, enabled, updated_at
  ) VALUES (
    'binance_token_dynamic_info',
    'Get dynamic token market data from Binance Web3 API',
    'https://web3.binance.com/bapi/defi/v4/public/wallet-direct/buw/wallet/market/token/dynamic/info',
    'GET',
    '{"type":"object","properties":{"chainId":{"type":"string","description":"56 or CT_501"},"contractAddress":{"type":"string","description":"token contract"}},"required":["chainId","contractAddress"]}'::jsonb,
    'query',
    'data',
    TRUE,
    NOW()
  ) ON CONFLICT (skill_name) DO UPDATE SET
    description = EXCLUDED.description,
    endpoint = EXCLUDED.endpoint,
    method = EXCLUDED.method,
    parameters = EXCLUDED.parameters,
    request_placement = EXCLUDED.request_placement,
    extract_path = EXCLUDED.extract_path,
    enabled = EXCLUDED.enabled,
    updated_at = NOW();
- Binance Web3 starter catalog: docs/external-skills.binance-web3.example.json
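Two of the notes above, header env placeholders and GET/POST parameter placement, can be sketched together. This is an illustration under stated assumptions, not the runner's implementation: `expandEnvPlaceholders` and `buildSkillRequest` are hypothetical names, an unset variable is assumed to expand to an empty string, and the Content-Type header on POST is an assumption.

```typescript
type Env = Record<string, string | undefined>;

// Replace each ${ENV:NAME} with env[NAME].
// Assumption: an unset variable expands to an empty string.
function expandEnvPlaceholders(value: string, env: Env): string {
  return value.replace(
    /\$\{ENV:([A-Za-z_][A-Za-z0-9_]*)\}/g,
    (_match, name: string) => env[name] ?? "",
  );
}

interface SkillDef {
  endpoint: string;
  method: "GET" | "POST";
  headers?: Record<string, string>;
}

interface SkillRequest {
  url: string;
  method: "GET" | "POST";
  headers: Record<string, string>;
  body?: string;
}

function buildSkillRequest(
  skill: SkillDef,
  params: Record<string, string>,
  env: Env,
): SkillRequest {
  const source = skill.headers ?? {};
  const headers: Record<string, string> = {};
  for (const key of Object.keys(source)) {
    headers[key] = expandEnvPlaceholders(source[key] ?? "", env);
  }

  if (skill.method === "GET") {
    // GET: params go into the query string.
    const qs = Object.keys(params)
      .map((k) => `${encodeURIComponent(k)}=${encodeURIComponent(String(params[k]))}`)
      .join("&");
    const url = qs ? `${skill.endpoint}?${qs}` : skill.endpoint;
    return { url, method: "GET", headers };
  }

  // POST: params go into a JSON body (Content-Type header is an assumption).
  return {
    url: skill.endpoint,
    method: "POST",
    headers: { ...headers, "Content-Type": "application/json" },
    body: JSON.stringify(params),
  };
}
```

Keeping secrets as `${ENV:...}` references means the catalog row or JSON file can be shared without exposing the key itself.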
