shll-agent-runner

v1.0.1

Published

3 months ago

SHLL Agent Runtime - Automated execution service for Agent NFAs

kled7

SHLL Runner

Deterministic agent runner for SHLL autonomous execution.

What It Does

Runs agent cycles with a fixed runtime pipeline:
- observe -> propose -> plan -> validate -> simulate -> execute -> verify -> record
Enforces hard/soft guardrails in code (not only in the prompt).
Rejects low-confidence non-wait actions with a runtime confidence gate.
Writes structured run records (failureCategory, errorCode, executionTrace).
Supports shadow mode for canary comparison without affecting on-chain state.

Target Architecture

The runner now exposes the main target-architecture building blocks instead of relying on prompt-only recovery and text inference:

TurnExecutionContext
- tracks active model, tool mode, fallback attempts, and quality-recovery attempts for one turn
TurnOrchestrator
- owns initial generation, fallback-model retry, low-quality retry, and quality-gate recovery
DecisionEvidence
- attaches structured tool/evidence summaries to decisions so quality checks do not depend only on regex over free text
waitKind / outcomeKind
- makes wait classification explicit (quality_recovery, cadence, infra_retry, etc.) so runtime/scheduler can stop guessing from reasoning
persisted decision metadata
- decision logs now retain wait kind, outcome kind, evidence, and turn context snapshots for replay/debugging

Quick Start

Install dependencies:
- npm install
Configure environment:
- copy .env.example to .env
- fill required vars (RPC_URL, OPERATOR_PK, AGENT_NFA_ADDRESS, DB config, etc.)
Run:
- npm run dev
Verify build/tests:
- npm run test:replay

Production Minimum Env

Minimum required variables for production runtime:

RPC_URL
CHAIN_ID
OPERATOR_PK
AGENT_NFA_ADDRESS
DATABASE_URL (or PGHOST/PGPORT/PGUSER/PGPASSWORD/PGDATABASE)
API_KEY (recommended non-empty)
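
The minimum set above could be checked at startup with something like the sketch below. The function name and the rule that all five PG* variables are needed when DATABASE_URL is absent are assumptions for illustration, not taken from the runner source. API_KEY is left out of the check because it is listed as recommended only.

```typescript
type Env = Record<string, string | undefined>;

// Variable names come from the list above.
const REQUIRED = ["RPC_URL", "CHAIN_ID", "OPERATOR_PK", "AGENT_NFA_ADDRESS"];
const PG_PARTS = ["PGHOST", "PGPORT", "PGUSER", "PGPASSWORD", "PGDATABASE"];

function isSet(env: Env, key: string): boolean {
  return (env[key] ?? "").trim() !== "";
}

// Hypothetical helper: returns the names of missing variables.
// An empty array means the minimum production set is present.
function missingProductionEnv(env: Env): string[] {
  const missing = REQUIRED.filter((key) => !isSet(env, key));
  // Database config: DATABASE_URL, or (assumption) the full PG* set.
  if (!isSet(env, "DATABASE_URL")) {
    const missingPg = PG_PARTS.filter((key) => !isSet(env, key));
    if (missingPg.length > 0) {
      missing.push("DATABASE_URL (or " + missingPg.join(", ") + ")");
    }
  }
  return missing;
}
```

In a Node.js process this would typically be called as missingProductionEnv(process.env) before the runner starts polling.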

Recommended baseline values:

LOG_LEVEL=info
POLL_INTERVAL_MS=30000
TOKEN_LOCK_LEASE_MS=90000
MAX_RETRIES=3
PG_POOL_MAX=10
MAX_RUN_RECORDS=1000
STATUS_RUNS_LIMIT=20
LLM_MIN_ACTION_CONFIDENCE=0.45

Confidence gate behavior:

Applied only when decision.action !== "wait".
If confidence < LLM_MIN_ACTION_CONFIDENCE, runner blocks the action with:
- failureCategory=model_output_error
- errorCode=MODEL_LOW_CONFIDENCE
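
The gate rule above can be sketched as a small pure function. The Decision and GateResult shapes and the function name are hypothetical; only the "wait" exemption, the threshold comparison, and the two field values come from this README.

```typescript
// Hypothetical minimal decision shape; the real runner type may carry more fields.
interface Decision {
  action: string;
  confidence: number;
}

type GateResult =
  | { allowed: true }
  | { allowed: false; failureCategory: "model_output_error"; errorCode: "MODEL_LOW_CONFIDENCE" };

// minConfidence mirrors LLM_MIN_ACTION_CONFIDENCE (0.45 is the recommended baseline).
function applyConfidenceGate(decision: Decision, minConfidence: number = 0.45): GateResult {
  // Wait decisions are never gated.
  if (decision.action === "wait") {
    return { allowed: true };
  }
  // Strictly below the threshold is blocked; equal to the threshold passes.
  if (decision.confidence < minConfidence) {
    return {
      allowed: false,
      failureCategory: "model_output_error",
      errorCode: "MODEL_LOW_CONFIDENCE",
    };
  }
  return { allowed: true };
}
```

Under this sketch a low-confidence "wait" always passes, while any other action with confidence 0.3 is blocked at the default threshold.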

Shadow Mode (Phase 4)

Use shadow mode to compare current planner behavior against legacy planner behavior.

Environment Flags

SHADOW_MODE_ENABLED=true|false
SHADOW_MODE_TOKEN_IDS=1,2,3 (optional; empty means all schedulable tokens)
SHADOW_EXECUTE_TX=false|true
- false: dry-run only, no on-chain transaction submission
- true: allow on-chain submit in shadow mode (use with caution)
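
A minimal sketch of reading these flags is shown below. The ShadowConfig shape and function name are hypothetical. Treating an unset flag as false and rejecting non-numeric token IDs are assumptions, and token IDs are kept as strings so large uint values are not rounded.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical parsed form of the three flags above.
interface ShadowConfig {
  enabled: boolean;
  tokenIds: string[] | "all"; // "all" means every schedulable token
  executeTx: boolean;
}

function parseShadowConfig(env: Env): ShadowConfig {
  const raw = (env["SHADOW_MODE_TOKEN_IDS"] ?? "").trim();
  const ids = raw === ""
    ? []
    : raw.split(",").map((s) => s.trim()).filter((s) => s !== "");
  for (const id of ids) {
    // Assumption: token IDs are unsigned decimal integers.
    if (!/^\d+$/.test(id)) {
      throw new Error("Invalid token id in SHADOW_MODE_TOKEN_IDS: " + id);
    }
  }
  return {
    enabled: env["SHADOW_MODE_ENABLED"] === "true",
    // An empty list means all schedulable tokens, as documented above.
    tokenIds: ids.length === 0 ? "all" : ids,
    // Assumption: anything other than "true" keeps the dry-run default.
    executeTx: env["SHADOW_EXECUTE_TX"] === "true",
  };
}
```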

Stored Fields

Each run now includes:

run_mode: primary or shadow
shadow_compare: structured comparison between primary and legacy plans

Metrics API

GET /shadow/metrics
Optional query:
- tokenId=<uint>
- sinceHours=<1..720>

Response contains per-mode metrics:

total/success/rejected/exception/intervention counts
success/reject/exception/intervention/divergence rates
average end-to-end latency derived from execution trace
shadow-vs-primary rate deltas
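
For callers of the metrics endpoint, the optional query parameters could be assembled as in the sketch below. The helper name is hypothetical, and requiring sinceHours to be a whole number is an assumption; the README only gives the 1..720 range.

```typescript
// Hypothetical client-side query shape; both fields are optional per the README.
interface ShadowMetricsQuery {
  tokenId?: string;    // unsigned integer, kept as a string
  sinceHours?: number; // 1..720
}

function shadowMetricsPath(query: ShadowMetricsQuery = {}): string {
  const parts: string[] = [];
  if (query.tokenId !== undefined) {
    if (!/^\d+$/.test(query.tokenId)) {
      throw new Error("tokenId must be an unsigned integer");
    }
    parts.push("tokenId=" + encodeURIComponent(query.tokenId));
  }
  if (query.sinceHours !== undefined) {
    const hours = query.sinceHours;
    // Assumption: only whole hours are accepted.
    if (Math.floor(hours) !== hours || hours < 1 || hours > 720) {
      throw new Error("sinceHours must be an integer in 1..720");
    }
    parts.push("sinceHours=" + hours);
  }
  return parts.length === 0
    ? "/shadow/metrics"
    : "/shadow/metrics?" + parts.join("&");
}
```

Validating on the client side keeps out-of-range requests from reaching the runner, whose own handling of invalid values is not documented here.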

Core API Endpoints

GET /health
GET /status
GET /status/all
GET /agent/dashboard
GET /agent/activity
GET /shadow/metrics
POST /enable
POST /disable
POST /strategy/upsert
POST /strategy/clear-goal

Validation and Replay

npm run test:replay
- compiles TypeScript
- runs replay classifier tests
- runs params validator tests
- runs run-failure classifier tests
- runs planner tests

Compatibility

This runner-side refactor keeps chain interface behavior intact and remains compatible with BAP-578-based agent assets/contracts.

External Skills (No-Code Readonly Extension)

Runner supports dynamic readonly tool registration from an external skill catalog. This allows adding new data/analysis skills without changing runner source code.

Environment:

RUNNER_EXTERNAL_SKILLS_DB_ENABLED - default true; load catalog from DB first
RUNNER_EXTERNAL_SKILLS_PATH - absolute/relative path to a JSON catalog file (DB fallback #1)
RUNNER_EXTERNAL_SKILLS_JSON - inline JSON catalog string (DB fallback #2)
RUNNER_EXTERNAL_SKILL_AUTO_BIND - default true; auto-attach loaded external skills to selected agent types
RUNNER_EXTERNAL_SKILL_BIND_AGENT_TYPES - comma-separated list of agent types; default meme_trader,llm_trader,llm_defi; supports the * wildcard
RUNNER_LLM_TOOL_RESULT_MAX_CHARS - max JSON chars returned per tool call to LLM (default 6000) to prevent context overflow

Catalog source priority:

DB table external_skill_catalogs (when enabled and non-empty)
RUNNER_EXTERNAL_SKILLS_PATH
RUNNER_EXTERNAL_SKILLS_JSON
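
The priority order above amounts to a first-match selection, sketched below. The input shape, the "none" result, and the function name are hypothetical; the ordering and the "enabled and non-empty" DB condition come from this README.

```typescript
type CatalogSource = "db" | "path" | "json" | "none";

// Hypothetical inputs gathered before the catalog is loaded.
interface CatalogInputs {
  dbEnabled: boolean;  // RUNNER_EXTERNAL_SKILLS_DB_ENABLED
  dbRowCount: number;  // rows found in external_skill_catalogs
  skillsPath?: string; // RUNNER_EXTERNAL_SKILLS_PATH
  skillsJson?: string; // RUNNER_EXTERNAL_SKILLS_JSON
}

function pickCatalogSource(input: CatalogInputs): CatalogSource {
  // 1. DB wins only when enabled and non-empty.
  if (input.dbEnabled && input.dbRowCount > 0) return "db";
  // 2. File path is the first fallback.
  if ((input.skillsPath ?? "").trim() !== "") return "path";
  // 3. Inline JSON is the second fallback.
  if ((input.skillsJson ?? "").trim() !== "") return "json";
  // Assumption: with nothing configured, no external skills are loaded.
  return "none";
}
```

Note that an enabled but empty DB table falls through to the file or inline catalog rather than producing an empty catalog.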

Admin API (requires /v3/admin/* auth):

GET /v3/admin/external-skills - list DB catalog
GET /v3/admin/external-skills/:name - get one DB catalog row
PUT /v3/admin/external-skills/:name - upsert one DB catalog row
DELETE /v3/admin/external-skills/:name - delete one DB catalog row
POST /v3/admin/external-skills/reload - reload catalog to runtime (no restart)

Catalog item shape:

[
  {
    "name": "binance_token_dynamic_info",
    "description": "Get token dynamic market data from Binance Web3 endpoint",
    "endpoint": "https://web3.binance.com/bapi/defi/v4/public/wallet-direct/buw/wallet/market/token/dynamic/info",
    "method": "GET",
    "headers": {
      "Accept-Encoding": "identity"
    },
    "parameters": {
      "type": "object",
      "properties": {
        "chainId": { "type": "string", "description": "56 or CT_501" },
        "contractAddress": { "type": "string", "description": "token contract address" }
      },
      "required": ["chainId", "contractAddress"]
    },
    "requestPlacement": "query",
    "extractPath": "data"
  }
]
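
A catalog item of this shape could be checked before registration roughly as follows. This is a sketch under assumptions, not the runner's validator: the function name is hypothetical, and treating method and requestPlacement as optional but strictly checked when present is inferred from the DB schema listing them as optional columns.

```typescript
// Hypothetical validator: returns a list of problems, empty when the item passes.
function validateCatalogItem(item: Record<string, unknown>): string[] {
  const problems: string[] = [];
  // name, description and endpoint mirror the required DB columns.
  for (const field of ["name", "description", "endpoint"]) {
    const value = item[field];
    if (typeof value !== "string" || value.trim() === "") {
      problems.push(field + " must be a non-empty string");
    }
  }
  const params = item["parameters"];
  if (typeof params !== "object" || params === null || Array.isArray(params)) {
    problems.push("parameters must be an object");
  }
  // Strict enum checks; absent values are left to runner defaults (assumption).
  const method = item["method"];
  if (method !== undefined && method !== "GET" && method !== "POST") {
    problems.push("method must be GET or POST");
  }
  const placement = item["requestPlacement"];
  if (placement !== undefined && placement !== "query" && placement !== "json") {
    problems.push("requestPlacement must be query or json");
  }
  return problems;
}
```

Collecting every problem instead of failing on the first one makes a rejected catalog entry easier to fix in a single pass.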

Notes:

External skills are strictly readonly (encoded as a no-op payload), so write-path safety remains unchanged.
Header values support env placeholders: ${ENV:MY_API_KEY}.
For GET, params are sent as query string; for POST, params are sent as JSON body.
External skill names cannot override existing built-in actions; duplicate names are skipped at bootstrap.
This collision check also applies among external skills loaded in one startup batch.
Catalog validation is strict for method (GET/POST) and requestPlacement (query/json).
When auto-bind is enabled, newly loaded external skills become available to configured agent types without manual blueprint edits.
DB schema:
- table: external_skill_catalogs
- required columns: skill_name, description, endpoint, parameters
- optional columns: method, timeout_ms, headers, request_placement, extract_path, enabled
SQL example (upsert one skill):

INSERT INTO external_skill_catalogs (
  skill_name, description, endpoint, method, parameters, request_placement, extract_path, enabled, updated_at
) VALUES (
  'binance_token_dynamic_info',
  'Get dynamic token market data from Binance Web3 API',
  'https://web3.binance.com/bapi/defi/v4/public/wallet-direct/buw/wallet/market/token/dynamic/info',
  'GET',
  '{"type":"object","properties":{"chainId":{"type":"string","description":"56 or CT_501"},"contractAddress":{"type":"string","description":"token contract"}},"required":["chainId","contractAddress"]}'::jsonb,
  'query',
  'data',
  TRUE,
  NOW()
) ON CONFLICT (skill_name) DO UPDATE SET
  description = EXCLUDED.description,
  endpoint = EXCLUDED.endpoint,
  method = EXCLUDED.method,
  parameters = EXCLUDED.parameters,
  request_placement = EXCLUDED.request_placement,
  extract_path = EXCLUDED.extract_path,
  enabled = EXCLUDED.enabled,
  updated_at = NOW();

Binance Web3 starter catalog: docs/external-skills.binance-web3.example.json
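
The header placeholder and GET/POST parameter rules from the notes above can be sketched as two small helpers. Both function names are hypothetical, throwing on a missing environment variable is an assumption about failure handling, and the endpoint is assumed to carry no query string of its own.

```typescript
type Env = Record<string, string | undefined>;

// Replace ${ENV:NAME} placeholders in each header value.
function resolveHeaders(headers: Record<string, string>, env: Env): Record<string, string> {
  const out: Record<string, string> = {};
  for (const key of Object.keys(headers)) {
    const raw = headers[key] ?? "";
    out[key] = raw.replace(
      /\$\{ENV:([A-Za-z_][A-Za-z0-9_]*)\}/g,
      (_match: string, name: string): string => {
        const value = env[name];
        // Assumption: an unresolved placeholder is an error, not an empty string.
        if (value === undefined) {
          throw new Error("Missing environment variable " + name + " for header " + key);
        }
        return value;
      }
    );
  }
  return out;
}

interface BuiltRequest {
  url: string;
  method: "GET" | "POST";
  body?: string;
}

// GET sends params as a query string; POST sends them as a JSON body.
function buildRequest(endpoint: string, method: "GET" | "POST", params: Record<string, string>): BuiltRequest {
  if (method === "POST") {
    return { url: endpoint, method: method, body: JSON.stringify(params) };
  }
  const query = Object.keys(params)
    .map((k) => encodeURIComponent(k) + "=" + encodeURIComponent(params[k] ?? ""))
    .join("&");
  return { url: query === "" ? endpoint : endpoint + "?" + query, method: method };
}
```

Keeping secrets in the environment and resolving them only at request time means the catalog itself, whether in the DB or in a JSON file, never has to store API keys.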