@gxcloud/agent-forge

v0.5.0

Published

6 days ago

A tiny operating system for coding agents

gaussag

agent cli mcp task-management coding-agent ai

agent-forge

A tiny CLI operating system for coding agents. Manages tasks, decisions, scratchpads, validation, and packages — all from the command line with compact JSON output.

npm install -D @gxcloud/agent-forge

Quick Start

# Initialize the .agent/ directory and SQLite database
npx agent-forge init

# Create a task
npx agent-forge task create '{"title":"Add auth","goal":"Implement login","acceptanceCriteria":["Users can log in"],"validation":["npm test"]}'

# Check what's next
npx agent-forge task current

# Use the scratchpad to track investigation state
npx agent-forge scratchpad set '{"objective":"Implement login","hypothesis":"Bug in router.ts","findings":[{"description":"Missing validation","confidence":"high"}],"plannedActions":["Add validation middleware"]}'

# Run validation
npx agent-forge test

# Mark complete (auto-commits to git, clears scratchpad)
npx agent-forge task complete <task_id>

# Record decisions
npx agent-forge decision add '{"title":"Use SQLite","context":"Need persistence","decision":"Use SQLite","reason":"Simple","tags":["db"]}'

Init

npx agent-forge init

Creates the following structure in the project root:

.agent/
  agent.sqlite       # SQLite database (tasks, decisions, scratchpads, validation runs)
  config.json        # Project configuration
AGENTS.md            # Instructions for coding agents
.opencode.json       # MCP server config for OpenCode
.claude/settings.json # MCP server config for Claude Code
.codex/settings.json # MCP server config for Codex CLI
.gitignore           # .agent/ entry added if not present

init is idempotent: running it again does not overwrite existing files or data. Database tables are created with CREATE TABLE IF NOT EXISTS, and the .agent/ entry is added to .gitignore only if it is not already present.

Commands

| Command | Description |
|---|---|
| init | Initialize .agent/ directory and database |
| task create <json> | Create a task |
| task current | Show current active task |
| task list | List all tasks |
| task read <id> | Read task details |
| task check <id> | Check completion readiness |
| task complete <id> [--no-commit] | Complete task (auto-commits) |
| task update <id> <json> | Update task fields (partial) |
| task set-status <id> <status> | Set task status |
| task search <query> | Search tasks by title or goal |
| decision add <json> | Record an architecture decision |
| decision search [--query] [--tags] | Search decisions |
| decision read <id> | Read a decision |
| scratchpad set <json> | Set/update scratchpad for current task |
| scratchpad read [id] | Read scratchpad (current task or by id) |
| scratchpad clear [id] | Clear scratchpad (current task or by id) |
| test [--file <path>] | Run tests |
| typecheck | Run typecheck |
| package add <name> [--dev] | Install a package |
| package remove <name> | Uninstall a package |
| mcp | Run MCP server (stdio) |

Argument Patterns

agent-forge uses three argument styles:

Positional — the argument value is placed directly after the command:

agent-forge task read task_0001
agent-forge decision read dec_0001
agent-forge package add valibot

Flags — prefixed with --, value follows as the next argument or after =:

agent-forge test --file test/router.test.ts
agent-forge test --file=test/router.test.ts
agent-forge task complete task_0001 --no-commit

JSON string — complex structured data as a single quoted argument:

agent-forge task create '{"title":"My task","goal":"..."}'
agent-forge decision add '{"title":"Use SQLite","context":"..."}'
agent-forge scratchpad set '{"objective":"...","findings":[...]}'
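When the CLI is driven from a script rather than typed by hand, the JSON-string style is easiest to get right if the payload is passed as a single argv element instead of being spliced into a shell string. The sketch below shows one way to assemble the three argument styles; `buildArgv` and `FlagValue` are hypothetical names for illustration and are not part of agent-forge.

```typescript
// Hypothetical helper, not part of agent-forge: assembles an argv array
// covering the positional, flag, and JSON-string styles described above.
type FlagValue = string | boolean;

function buildArgv(
  command: string[],                      // e.g. ["task", "complete"]
  positionals: string[] = [],             // e.g. ["task_0001"]
  flags: Record<string, FlagValue> = {},  // e.g. { "no-commit": true }
  json?: unknown,                         // structured payload, if any
): string[] {
  const argv = [...command, ...positionals];
  if (json !== undefined) {
    // Serialized once and kept as one argv element, so no shell quoting
    // is needed when the process is spawned without a shell.
    argv.push(JSON.stringify(json));
  }
  for (const [name, value] of Object.entries(flags)) {
    if (value === false) continue;        // flag switched off: omit it
    if (value === true) argv.push(`--${name}`);
    else argv.push(`--${name}=${value}`); // the documented --flag=value form
  }
  return argv;
}

const createArgs = buildArgv(["task", "create"], [], {}, { title: "My task", goal: "..." });
// createArgs: ["task", "create", '{"title":"My task","goal":"..."}']
const completeArgs = buildArgv(["task", "complete"], ["task_0001"], { "no-commit": true });
// completeArgs: ["task", "complete", "task_0001", "--no-commit"]
```

The resulting array can be handed to something like Node's `execFile("npx", ["agent-forge", ...argv])`, which avoids quoting problems when a title or goal contains apostrophes.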

Tasks

Tasks represent units of work. Each task has a title, goal, scope (affected files), acceptance criteria, and validation commands. Tasks are stored in SQLite and auto-numbered (task_0001, task_0002, ...).

task create

agent-forge task create '<json>'

JSON schema:

{
  "title": "string (required, max 200 chars)",
  "goal": "string (required, max 2000 chars)",
  "scope": {
    "files": ["src/router.ts", "src/tree.ts"]
  },
  "acceptanceCriteria": [
    "GET /health resolves to correct handler",
    "All existing tests still pass"
  ],
  "validation": [
    "agent-forge test --file test/router.test.ts",
    "agent-forge typecheck"
  ]
}

Output:

{"success":true,"result":{"id":"task_0001","title":"Add auth"}}
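For callers that build the payload programmatically, the schema above can be mirrored as a type with a small pre-flight check. This is only a sketch: the field names and length limits come from the schema, but `TaskInput` and `checkTaskInput` are illustrative names, the optional fields are assumed optional because only title and goal are marked required, and rejecting empty strings is an assumption about what "required" means.

```typescript
// Illustrative only; agent-forge performs its own validation.
interface TaskInput {
  title: string;                  // required, max 200 chars
  goal: string;                   // required, max 2000 chars
  scope?: { files: string[] };    // assumed optional
  acceptanceCriteria?: string[];  // assumed optional
  validation?: string[];          // assumed optional
}

// Returns a list of problems; an empty list means the payload looks sendable.
function checkTaskInput(input: TaskInput): string[] {
  const problems: string[] = [];
  if (input.title.length === 0 || input.title.length > 200) {
    problems.push("title: must be 1-200 characters");
  }
  if (input.goal.length === 0 || input.goal.length > 2000) {
    problems.push("goal: must be 1-2000 characters");
  }
  return problems;
}

const draft: TaskInput = {
  title: "Add auth",
  goal: "Implement login",
  acceptanceCriteria: ["Users can log in"],
  validation: ["npm test"],
};
const draftProblems = checkTaskInput(draft); // [] for this draft
```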

task current

Returns the most recently updated task with status active. This is the task the agent should be working on. Returns null if no active task exists.

agent-forge task current

{"success":true,"result":{"task":{"id":"task_0001","title":"Add auth","status":"active"}}}

task list

Returns a compact list of all tasks — id, title, and status only. Ordered by creation date descending.

agent-forge task list

{"success":true,"result":{"tasks":[{"id":"task_0001","title":"Add auth","status":"active"}]}}

task read

Returns the full task details for a given id.

agent-forge task read task_0001

task check

Verifies that all required validation commands have been run and passed since the task was last updated. Returns the list of missing evidence if incomplete.

agent-forge task check task_0001

{"success":true,"result":{"complete":false,"missing":["Missing validation: npm test"]}}

Validation runs are matched by kind (test, typecheck, etc.) and must have created_at after the task's updated_at. Only the most recent run of each kind is considered.
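The matching rule above can be sketched as a pure function. The `ValidationRun` shape, the use of epoch milliseconds, and the name `missingKinds` are assumptions made for illustration; how agent-forge maps a validation command to a kind is not specified here and is left out.

```typescript
// Hypothetical record shape for a stored validation run.
interface ValidationRun {
  kind: string;       // "test", "typecheck", ...
  passed: boolean;
  createdAt: number;  // epoch milliseconds (assumed representation)
}

// Returns the required kinds that lack fresh, passing evidence.
function missingKinds(
  requiredKinds: string[],
  runs: ValidationRun[],
  taskUpdatedAt: number,
): string[] {
  // Keep only the most recent run of each kind.
  const latest = new Map<string, ValidationRun>();
  for (const run of runs) {
    const seen = latest.get(run.kind);
    if (seen === undefined || run.createdAt > seen.createdAt) {
      latest.set(run.kind, run);
    }
  }
  // A kind is missing if it never ran, its latest run failed,
  // or its latest run predates the task's last update.
  return requiredKinds.filter((kind) => {
    const run = latest.get(kind);
    return run === undefined || !run.passed || run.createdAt <= taskUpdatedAt;
  });
}
```

Under this reading, an earlier passing run does not count once a later run of the same kind fails, and any task update invalidates evidence gathered before it.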

task complete

Marks a task as completed. Refuses if any validation evidence is missing.

agent-forge task complete task_0001
agent-forge task complete task_0001 --no-commit

On success, the task's scratchpad is automatically cleared.

Auto-commit: If autoCommit is enabled (default: on) and the project is a git repository, the command stages all changes (git add -A) and commits with a structured message:

task_0001: Add auth

Goal: Implement login functionality
Checklist:
- Users can log in
- All existing tests still pass

Use --no-commit to skip the commit. Configure via .agent/config.json:

{ "autoCommit": false }
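The structured commit message shown above can be reproduced with a few lines of string assembly. This sketch assumes the Checklist entries are the task's acceptance criteria, which the example suggests but the text does not state; `CompletedTask` and `commitMessage` are illustrative names.

```typescript
// Illustrative shape; only the fields that appear in the message are included.
interface CompletedTask {
  id: string;
  title: string;
  goal: string;
  acceptanceCriteria: string[];
}

// Builds "<id>: <title>", a blank line, the goal, then one bullet per criterion.
function commitMessage(task: CompletedTask): string {
  const checklist = task.acceptanceCriteria.map((criterion) => `- ${criterion}`);
  return [
    `${task.id}: ${task.title}`,
    "",
    `Goal: ${task.goal}`,
    "Checklist:",
    ...checklist,
  ].join("\n");
}

const message = commitMessage({
  id: "task_0001",
  title: "Add auth",
  goal: "Implement login functionality",
  acceptanceCriteria: ["Users can log in", "All existing tests still pass"],
});
```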

task update

Updates one or more task fields without affecting others. All fields are optional — only provided fields change.

agent-forge task update task_0001 '{"title":"New title","goal":"New goal"}'
agent-forge task update task_0001 '{"scope":{"files":["src/new.ts"]}}'

JSON schema (all fields optional):

{
  "title": "string (max 200 chars)",
  "goal": "string (max 2000 chars)",
  "scope": { "files": ["src/file.ts"] },
  "acceptanceCriteria": ["Updated criteria"],
  "validation": ["npm test"]
}

task set-status

Sets the task status explicitly. Useful for marking tasks as blocked, cancelled, or in review.

agent-forge task set-status task_0001 blocked
agent-forge task set-status task_0001 cancelled
agent-forge task set-status task_0001 in_review

Valid statuses: active, completed, cancelled, blocked, in_review.
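A caller can reject an invalid status before invoking the CLI by checking against the list above. The sketch below encodes the five documented statuses; `TASK_STATUSES`, `TaskStatus`, and `isTaskStatus` are illustrative names.

```typescript
// The five statuses listed above, as a literal tuple.
const TASK_STATUSES = ["active", "completed", "cancelled", "blocked", "in_review"] as const;
type TaskStatus = (typeof TASK_STATUSES)[number];

// Type guard: narrows an arbitrary string to a documented status.
function isTaskStatus(value: string): value is TaskStatus {
  return (TASK_STATUSES as readonly string[]).includes(value);
}

const requested = "in_review";
const accepted = isTaskStatus(requested); // true
```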

task search

Searches tasks by title or goal using FTS5 full-text search (falls back to LIKE when FTS5 is unavailable).

agent-forge task search auth
agent-forge task search "login feature"

Decisions

Decisions are lightweight Architecture Decision Records stored in SQLite. Use them to capture why the project chose a particular approach.

decision add

agent-forge decision add '<json>'

JSON schema:

{
  "title": "Use Valibot for CLI input validation",
  "context": "Agent tools receive JSON input and can execute side effects.",
  "decision": "Every command validates input with Valibot before execution.",
  "reason": "This keeps tool calls typed, safe, and predictable.",
  "consequences": "Every new command needs a schema.",
  "alternatives": ["Zod", "manual validation"],
  "tags": ["validation", "cli"]
}

consequences and alternatives are optional. tags is required and is used for filtering in search.

decision search

Searches decisions by text query and/or tags. Uses SQLite FTS5 full-text search when available, falling back to LIKE queries.

agent-forge decision search --query validation
agent-forge decision search --tags cli,architecture
agent-forge decision search --query valibot --tags cli

Returns compact results (id, title, tags) only — not full decision bodies.

decision read

Returns the full decision for a given id.

agent-forge decision read dec_0001

Scratchpad

The scratchpad is a task-scoped working memory for the agent's temporary reasoning state. Each scratchpad belongs to exactly one task and stores structured fields:

| Field | Type | Description |
|---|---|---|
| objective | string | Concise task goal |
| status | enum | investigating, implementing, validating, blocked, waiting |
| hypothesis | string | Current theory about the problem |
| inspected | array | [{file, reason, relevance}]: files examined |
| findings | array | [{description, confidence}]: evidence; confidence is high, medium, or low |
| plannedActions | array | string[]: work still to do |
| completedActions | array | string[]: work already done |
| blockers | array | [{description, severity}]: obstacles; severity is high, medium, or low |

scratchpad set

Creates or updates the scratchpad for the current active task. All fields are optional; only the provided fields are merged into the existing scratchpad (arrays concatenate, scalars replace).

agent-forge scratchpad set '{"objective":"Fix login","status":"investigating","hypothesis":"Bug in router.ts","findings":[{"description":"Missing validation","confidence":"high"}],"plannedActions":["Add middleware"],"blockers":[{"description":"Need spec","severity":"medium"}]}'

Set "mode": "replace" to replace arrays entirely instead of concatenating. This is useful for pruning stale entries after summarizing:

agent-forge scratchpad set '{"findings":[{"description":"Root cause identified","confidence":"high"}],"mode":"replace"}'
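The merge behavior of scratchpad set (arrays concatenate, scalars replace, and "mode": "replace" swaps arrays wholesale) can be illustrated with a small function. This is a sketch of the documented semantics, not agent-forge's code; `mergeScratchpad` is a hypothetical name, and dropping the mode key from the stored result is an assumption.

```typescript
type Scratchpad = Record<string, unknown>;

// Merges an update into the current scratchpad following the documented rules.
function mergeScratchpad(current: Scratchpad, update: Scratchpad): Scratchpad {
  // "mode" controls the merge and is assumed not to be stored itself.
  const { mode, ...fields } = update;
  const replaceArrays = mode === "replace";
  const next: Scratchpad = { ...current };
  for (const [key, value] of Object.entries(fields)) {
    const existing = next[key];
    if (Array.isArray(value) && Array.isArray(existing) && !replaceArrays) {
      next[key] = [...existing, ...value];  // arrays concatenate by default
    } else {
      next[key] = value;                    // scalars, new keys, or replace mode
    }
  }
  return next;
}

const before: Scratchpad = { hypothesis: "Bug in router.ts", plannedActions: ["Add middleware"] };
const merged = mergeScratchpad(before, { plannedActions: ["Write test"] });
// merged.plannedActions: ["Add middleware", "Write test"]
const pruned = mergeScratchpad(merged, { plannedActions: ["Write test"], mode: "replace" });
// pruned.plannedActions: ["Write test"]
```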

scratchpad read

Returns the scratchpad for the current active task, or for a specific task by id. Returns null if no scratchpad exists.

agent-forge scratchpad read
agent-forge scratchpad read task_0001

scratchpad clear

Deletes the scratchpad for the current active task, or for a specific task by id.

agent-forge scratchpad clear
agent-forge scratchpad clear task_0001

Lifecycle: Scratchpads are automatically cleared when the task is completed via task complete. Keep the scratchpad small — it represents the minimum info needed to continue, not a full journal. Obsolete findings and completed actions should be summarized or pruned.

Validation

test

Runs the configured test command and returns compact JSON. Only failed tests are listed in the output; a fully passing run is reported as just "passed": true.

agent-forge test
agent-forge test --file test/router.test.ts

{"success":true,"result":{"passed":true}}

On failure:

{"success":true,"result":{"passed":false,"failedTests":[{"file":"test/router.test.ts","name":"static route matching","error":"Expected X, got Y"}]}}

The test command is configured in .agent/config.json:

{ "testCommand": "npx vitest run" }

typecheck

Runs the configured typecheck command and returns compact JSON. Only TypeScript errors are included.

agent-forge typecheck

{"success":true,"result":{"passed":true}}

On failure:

{"success":true,"result":{"passed":false,"errors":[{"file":"src/router.ts","line":42,"message":"Type 'string' is not assignable to type 'number'"}]}}

The typecheck command is configured in .agent/config.json:

{ "typecheckCommand": "npx tsc --noEmit --pretty false" }

Validation runs are stored in the database as evidence. They are used by task check and task complete to verify that validation completed successfully after the task was last updated.

Package Management

Package operations go through the project's package manager CLI (npm, pnpm, or yarn). Do not manually edit package.json.

agent-forge package add valibot
agent-forge package add vitest --dev
agent-forge package remove lodash

Package names are validated with a strict regex: /^(?:@[a-z0-9-~][a-z0-9-._~]*\/)?[a-z0-9-~][a-z0-9-._~]*$/ (allows ., _, ~ in names).
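The regex above can be exercised directly to see which names it accepts. The pattern is copied from the text; `PACKAGE_NAME` and `isValidPackageName` are illustrative names, and whether agent-forge applies further checks beyond the regex is not known here.

```typescript
// Pattern copied verbatim from the documentation above.
const PACKAGE_NAME = /^(?:@[a-z0-9-~][a-z0-9-._~]*\/)?[a-z0-9-~][a-z0-9-._~]*$/;

function isValidPackageName(name: string): boolean {
  return PACKAGE_NAME.test(name);
}

const checks = {
  plain: isValidPackageName("valibot"),                // true
  scoped: isValidPackageName("@gxcloud/agent-forge"),  // true
  uppercase: isValidPackageName("Lodash"),             // false: uppercase not allowed
  injection: isValidPackageName("lodash && rm -rf ~"), // false: spaces and & rejected
};
```

Because the pattern admits only lowercase letters, digits, and a few punctuation characters, strings containing shell metacharacters or whitespace fail before they reach the package manager.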

The package manager is configured in .agent/config.json:

{ "packageManager": "npm" }

Configuration

Project configuration lives in .agent/config.json:

{
  "testCommand": "npx vitest run",
  "typecheckCommand": "npx tsc --noEmit --pretty false",
  "packageManager": "npm",
  "autoCommit": true
}

| Field | Default | Description |
|---|---|---|
| testCommand | npx vitest run | Shell command for running tests |
| typecheckCommand | npx tsc --noEmit --pretty false | Shell command for typecheck |
| packageManager | npm | Package manager: npm, pnpm, or yarn |
| autoCommit | true | Auto-commit to git on task complete |

The file is created by init with defaults. Missing fields fall back to defaults.
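The fallback rule for missing fields amounts to overlaying the file's contents on the defaults from the table. A minimal sketch, assuming the file is valid JSON and contains only the four documented keys; `AgentConfig`, `DEFAULTS`, and `resolveConfig` are illustrative names.

```typescript
interface AgentConfig {
  testCommand: string;
  typecheckCommand: string;
  packageManager: "npm" | "pnpm" | "yarn";
  autoCommit: boolean;
}

// Defaults as listed in the configuration table.
const DEFAULTS: AgentConfig = {
  testCommand: "npx vitest run",
  typecheckCommand: "npx tsc --noEmit --pretty false",
  packageManager: "npm",
  autoCommit: true,
};

// Overlays whatever the file provides on top of the defaults.
function resolveConfig(fileContents: string): AgentConfig {
  const parsed = JSON.parse(fileContents) as Partial<AgentConfig>;
  return { ...DEFAULTS, ...parsed };
}

const resolved = resolveConfig('{ "autoCommit": false, "packageManager": "pnpm" }');
// resolved.autoCommit: false; resolved.testCommand: "npx vitest run"
```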

Output Format

Every command returns exactly one line of compact JSON on stdout:

Success:

{"success":true,"result":{...}}

Error (exit code 1):

{"success":false,"error":"Task not found"}

Validation errors include field paths:

{"success":false,"error":"Validation error: title: Invalid length; goal: Invalid length"}
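Because every command prints a single JSON line with a success flag, callers can model the output as a discriminated union and branch on it. The sketch below parses one stdout line under that assumption; `Envelope`, `parseEnvelope`, and `unwrap` are illustrative names, and the shape of result varies per command.

```typescript
// Success and error envelopes, as documented above.
type Envelope<T> =
  | { success: true; result: T }
  | { success: false; error: string };

// Parses one line of stdout into an envelope, rejecting unexpected shapes.
function parseEnvelope<T>(stdoutLine: string): Envelope<T> {
  const parsed: unknown = JSON.parse(stdoutLine);
  if (typeof parsed === "object" && parsed !== null && "success" in parsed) {
    const candidate = parsed as { success: unknown; result?: unknown; error?: unknown };
    if (candidate.success === true) {
      // The caller chooses T; the result is not validated further here.
      return { success: true, result: candidate.result as T };
    }
    if (candidate.success === false && typeof candidate.error === "string") {
      return { success: false, error: candidate.error };
    }
  }
  throw new Error("Unexpected output shape");
}

// Returns the result or throws with the CLI's error message.
function unwrap<T>(envelope: Envelope<T>): T {
  if (envelope.success) return envelope.result;
  throw new Error(envelope.error);
}

const created = unwrap(
  parseEnvelope<{ id: string; title: string }>(
    '{"success":true,"result":{"id":"task_0001","title":"Add auth"}}',
  ),
);
// created.id: "task_0001"
```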

MCP

agent-forge includes a stdio-based MCP (Model Context Protocol) server. It enables AI coding agents to call all commands as MCP tools with structured input/output rather than shelling out.

npx agent-forge mcp

The server exposes 20 tools covering all functionality:

| Tool | Description |
|---|---|
| init | Initialize project |
| task_create | Create a task |
| task_current | Get current active task |
| task_list | List all tasks |
| task_read | Read task by id |
| task_check | Check task readiness |
| task_complete | Complete a task |
| task_update | Update task fields |
| task_set_status | Set task status |
| task_search | Search tasks |
| decision_add | Add a decision |
| decision_search | Search decisions |
| decision_read | Read decision by id |
| scratchpad_set | Set/update scratchpad |
| scratchpad_read | Read scratchpad |
| scratchpad_clear | Clear scratchpad |
| test | Run tests |
| typecheck | Run typecheck |
| package_add | Install a package |
| package_remove | Uninstall a package |

The init command automatically writes MCP configuration files for OpenCode (.opencode.json), Claude Code (.claude/settings.json), and Codex CLI (.codex/settings.json), so the MCP server is available immediately after initialization.

The MCP server uses direct function calls (not shell-out) for better performance and error handling.
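The tool names in the table follow the CLI command names with spaces and hyphens turned into underscores. The sketch below derives a tool name that way and wraps it in a JSON-RPC `tools/call` request, which is the MCP method for invoking a tool. `toolNameFor` and `toolCallRequest` are hypothetical helpers, and the argument names each tool expects (such as `id` below) are assumptions, since the tool input schemas are not documented here.

```typescript
// "task set-status" -> "task_set_status", matching the table above.
function toolNameFor(command: string): string {
  return command.trim().split(/[\s-]+/).join("_");
}

// Builds a JSON-RPC 2.0 request for the MCP "tools/call" method.
function toolCallRequest(id: number, command: string, args: Record<string, unknown>) {
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: { name: toolNameFor(command), arguments: args },
  };
}

// The "id" argument name is assumed for illustration.
const request = toolCallRequest(1, "task read", { id: "task_0001" });
const wire = JSON.stringify(request);
```

In practice an MCP client library handles this framing; the sketch only shows how a CLI command corresponds to a tool call.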

License

MIT