
@javiayala/ai-workers

v0.2.1

Portable Pi/GLM worker tools for offloading low-risk agent work from premium coding agents.

ai-workers

Portable Pi/GLM worker tools for stretching premium coding-agent usage.

ai-workers lets a strong primary agent, such as Codex, offload low-risk work to cheaper Pi workers backed by a Z.ai GLM subscription. The primary agent keeps responsibility for architecture, final edits, security-sensitive reasoning, and verification. GLM workers handle reconnaissance, summarization, drafts, and first-pass reviews.

The MCP tools are intentionally phrased as generic code-intelligence operations so capable harnesses can discover and choose them for search, summarization, planning, drafting, review, and parallel context gathering without needing to reason about the underlying cost model.

Goal

The goal is to conserve valuable OpenAI/Codex subscription usage by moving token-heavy but low-judgment work to Pi running GLM models.

Good worker tasks:

  • Search and summarize several files.
  • Compress large context into a short handoff.
  • Draft boilerplate, tests, docs, release notes, or implementation outlines.
  • Run cheap first-pass review before Codex makes the final call.
  • Run several independent reconnaissance tasks in parallel.

Keep in Codex or another strong primary agent:

  • Architecture and product decisions.
  • Final patches and commits.
  • Security-sensitive auth, billing, secrets, or customer-data handling.
  • Subtle debugging and root-cause analysis.
  • Final verification claims.

Current Model Routing

Defaults live in src/config.js and are written to ~/.config/ai-workers/config.json by ai-workers install.

| Role | Default Model | Pi Tools | Intended Use |
| --- | --- | --- | --- |
| scout | zai/glm-4.5-air | read,grep,find,ls | Cheap read-only codebase reconnaissance |
| summarize | zai/glm-4.5-air | none | Compress supplied stdin/context |
| draft | zai/glm-4.5-air | read,grep,find,ls | Low-risk draft material |
| review | zai/glm-5.1 | read,grep,find,ls | First-pass quality review |
| plan | zai/glm-5.1 | read,grep,find,ls | Implementation outline for later review |

glm-4.5-air is the default cheap worker. glm-5.1 is reserved for tasks that need better judgment but should still avoid spending OpenAI usage.

Package Structure

ai-workers/
  bin/                         # Global executable entrypoints
  codex/skills/ai-workers/     # Codex skill installed by ai-workers install
  pi/prompts/                  # Pi prompt templates packaged for Pi users
  pi/skills/ai-workers/        # Pi skill packaged for Pi users
  src/                         # Runtime implementation
  test/                        # Node test suite
  package.json                 # npm package metadata and binary map

Important files:

  • src/config.js: default role-to-model routing and config file loading.
  • src/prompts.js: prompt wrapper sent to Pi workers, including safety rules.
  • src/search.js: optional CocoIndex Code search wrapper used by ai-workers search and worker prefetch.
  • src/pi-runner.js: builds and runs the pi subprocess invocation.
  • src/mcp-tools.js: MCP tool definitions and tool-to-role mapping.
  • src/mcp-server.js: stdio MCP server used by Codex.
  • src/cli.js: CLI command dispatcher, installer, doctor checks, and parallel command.
  • src/installer.js: install, upgrade, uninstall, and doctor implementation.
  • codex/skills/ai-workers/SKILL.md: instructions Codex loads when this workflow is relevant.
  • pi/skills/ai-workers/SKILL.md: equivalent Pi-side workflow instructions.

Install

From this checkout during local development:

npm install
npm link
ai-workers install

After this package is published:

npm install -g @javiayala/ai-workers
ai-workers install

ai-workers install does four things:

  1. Writes default config to ~/.config/ai-workers/config.json if it does not already exist.

  2. Copies the Codex skill to ~/.codex/skills/ai-workers/SKILL.md.

  3. Registers the Codex MCP server:

    codex mcp add ai-workers -- ai-workers-mcp

  4. Registers this package with Pi:

    pi install npm:@javiayala/ai-workers

Restart Codex after installation so the new MCP server and skill are visible in a fresh session.

Upgrade

ai-workers upgrade

Reinstalls the skill and refreshes registrations while preserving your config.

Uninstall

ai-workers uninstall

Removes the Codex skill and MCP registration. Config is preserved by default; add --config to remove it too.

Install flags

  • --dry-run: print actions without writing files or running registrations.
  • --force: overwrite generated files even if present.
  • --no-codex: skip Codex skill and MCP registration.
  • --no-pi: skip Pi package registration.

Prerequisites

  • Node.js 20 or newer.

  • Codex CLI installed and authenticated.

  • Pi coding agent installed and authenticated.

  • Z.ai access configured for Pi, usually through ZAI_API_KEY or Pi's own auth flow.

  • glm-4.5-air and glm-5.1 visible to Pi.

  • Optional: CocoIndex Code (ccc) for semantic code search:

    uv tool install --upgrade 'cocoindex-code[full]'
    ccc init
    ccc index
    codex mcp add cocoindex-code -- ccc mcp
    npx skills add cocoindex-io/cocoindex-code -g -a '*' -y --copy

Check the environment:

ai-workers doctor

The doctor command verifies:

  • node --version
  • pi --version
  • codex --version
  • pi --provider zai --model glm-4.5-air ...
  • pi --provider zai --model glm-5.1 ...

CLI Commands

Use these directly from a shell or from any coding agent that can run commands.

ai-scout "Find the files involved in authentication and summarize the flow"

Runs a cheap read-only scout through Pi using glm-4.5-air.

cat large-file.ts | ai-summarize "Summarize the public API and risky areas"

Uses glm-4.5-air with no tools. This is best for compressing supplied input.

ai-draft "Draft Pest tests for the parser module based on nearby test style"

Generates draft material for the primary agent to inspect before applying.

ai-review "Review the current diff for concrete bugs and missing tests"

Uses glm-5.1 for a stronger but still cheaper first-pass review.

ai-plan "Create an implementation plan and file map for adding result caching"

Uses glm-5.1 for a first-pass implementation outline, risks, and verification steps.

ai-workers auto "Map the repository areas needed to add result caching"

Uses planning plus parallel worker execution to turn one broad request into a compact handoff.

ai-workers search "where request retry logic is implemented"

Uses CocoIndex Code (ccc) for AST-aware semantic code search. This is optional: when ccc is installed, ai-workers also prefetches semantic search results for scout, review, and plan workers so Pi/GLM starts from narrower context. If ccc is missing, normal worker commands continue without search prefetch.

ai-workers parallel \
  --task "scout: inspect auth routing" \
  --task "scout: inspect database models" \
  --task "review: inspect the current diff"

Runs several Pi workers concurrently. This is the closest CLI equivalent to spawning cheap subagents.

Output and runtime options

Role commands, parallel, and auto support --json for normalized machine-readable output:

ai-review --json "Review the current diff"

All role commands support --cwd PATH:

ai-scout --cwd /path/to/repo "Find cache-related code"

Per-worker timeout and retry controls:

ai-scout --timeout 60000 "inspect auth routes"
ai-review --retries 1 "review current diff"
ai-workers parallel --timeout 90000 --retries 1 --task "scout: inspect api"

  • --timeout MS: kill a hung worker after MS milliseconds (default 120000).
  • --retries N: retry transient failures up to N times (default 0).
  • --retry-delay MS: wait between retries (default 1000).

Retries increase Pi/GLM usage. Timeout applies per worker. For auto, it applies to the planner and each planned worker.
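
The retry behavior these flags describe can be sketched as a generic loop. This is a hypothetical illustration of the pattern, not the actual src/pi-runner.js implementation:

```javascript
// Hypothetical sketch of the retry loop implied by --retries / --retry-delay.
// Not the actual src/pi-runner.js code.
async function runWithRetries(runWorker, { retries = 0, retryDelayMs = 1000 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      // Success: report how many attempts were spent, mirroring the
      // "attempts" field in structured results.
      return { ok: true, attempts: attempt + 1, output: await runWorker() };
    } catch (error) {
      lastError = error;
      if (attempt < retries) {
        // Wait before the next attempt (--retry-delay).
        await new Promise((resolve) => setTimeout(resolve, retryDelayMs));
      }
    }
  }
  return { ok: false, attempts: retries + 1, error: String(lastError) };
}
```

Note that each retry re-runs the full worker task, which is why retries increase Pi/GLM usage.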

Codex MCP Tools

After ai-workers install and a Codex restart, Codex can call these MCP tools:

  • ai_workers_scout
  • ai_workers_summarize
  • ai_workers_draft
  • ai_workers_review
  • ai_workers_plan
  • ai_workers_auto
  • ai_workers_parallel

Recommended Codex usage:

  • Ask Codex to use ai_workers_search or the direct cocoindex-code MCP server for semantic search before broad file reads.
  • Ask Codex to use ai_workers_parallel for independent reconnaissance.
  • Ask Codex to use ai_workers_scout before it reads many files itself.
  • Ask Codex to use ai_workers_summarize to compress large logs, diffs, or docs.
  • Ask Codex to use ai_workers_plan for first-pass implementation maps and task breakdowns.
  • Ask Codex to use ai_workers_auto for broad requests that should be split into independent searches, plans, drafts, or reviews.
  • Ask Codex to use ai_workers_review before finalizing a patch.

This does not replace Codex's native spawn_agent. Native Codex subagents still consume OpenAI-backed model usage. The cheap path is MCP or CLI calls that spawn Pi/GLM subprocesses.

MCP runtime options

All MCP tools accept optional timeout and retry fields:

  • timeoutMs: kill a hung worker after this many milliseconds.
  • retries: retry transient failures up to this many times.
  • retryDelayMs: wait this many milliseconds between retries.

Example:

{
  "task": "Inspect auth routing",
  "timeoutMs": 60000,
  "retries": 1
}

How It Works

The runtime path is:

Codex MCP tool or CLI command
  -> src/mcp-tools.js or src/cli.js
  -> src/pi-runner.js
  -> pi --provider zai --model <role-model> --print --no-session --no-context-files
  -> GLM worker output
  -> primary agent reviews and decides

For read-capable roles, Pi is invoked with:

--tools read,grep,find,ls

For summarization, Pi is invoked with:

--no-tools

Every Pi invocation is ephemeral:

--no-session --no-context-files

This keeps worker calls focused and avoids pulling project AGENTS.md or CLAUDE.md context into cheap workers unless the primary agent explicitly includes relevant information in the task.
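
As a rough illustration, the flag set above could be assembled from a role config like this. `buildPiArgs` is a hypothetical helper for illustration, not an export of src/pi-runner.js:

```javascript
// Hypothetical illustration of how the Pi invocation described above
// is assembled from a role entry. Not a real export of this package.
function buildPiArgs(role) {
  const args = [
    '--provider', role.provider,
    '--model', role.model,
    '--print',            // non-interactive: print result and exit
    '--no-session',       // ephemeral: no session state
    '--no-context-files', // skip project AGENTS.md / CLAUDE.md context
  ];
  if (role.tools && role.tools.length > 0) {
    args.push('--tools', role.tools.join(','));  // read-capable roles
  } else {
    args.push('--no-tools');                     // summarize role
  }
  return args;
}
```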

Prompt Contract

src/prompts.js wraps every worker task with a compact contract:

  • Do not read or output secrets, tokens, private keys, .env files, or auth files.
  • Prefer read-only investigation unless explicitly drafting.
  • If tools are unavailable, use only supplied input.
  • Do not make architecture decisions.
  • Keep output compact with evidence, risks, and next steps.
  • If exact output is requested, return only that exact output.

Worker output is always treated as untrusted context or a draft. The primary agent must verify before using it.

Structured Results

--json and MCP tools return normalized result objects with:

  • ok
  • role
  • task
  • cwd
  • output
  • findings
  • evidence
  • risks
  • next_steps
  • attempts
  • duration_ms

On failure, structured results also include:

  • error
  • timed_out

Parallel and auto calls wrap these objects in a top-level result with ok, cwd, and results.
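
For illustration, a successful scout result might look like the following. The values are made up; only the shape follows the field list above:

```json
{
  "ok": true,
  "role": "scout",
  "task": "Find cache-related code",
  "cwd": "/path/to/repo",
  "output": "...",
  "findings": ["..."],
  "evidence": ["..."],
  "risks": ["..."],
  "next_steps": ["..."],
  "attempts": 1,
  "duration_ms": 4200
}
```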

Recommended Workflow

For broad feature work:

  1. Codex makes the high-level plan.

  2. Codex delegates independent scans:

    ai-workers parallel \
      --task "scout: find routes and controllers for feature X" \
      --task "scout: find models and migrations for feature X" \
      --task "scout: find tests covering feature X"

  3. Codex reads the compressed output instead of reading every file first.

  4. Codex makes architecture decisions and applies edits.

  5. Codex asks ai-review or ai_workers_review for a cheap first-pass review.

  6. Codex runs real tests and final verification.

For logs or large docs:

cat output.log | ai-summarize "Extract failure causes, file paths, and suggested next checks"

For drafts:

ai-draft "Draft a README section explaining the installer behavior"

Use the draft as raw material, not final text.

Configuration

Default config path:

~/.config/ai-workers/config.json

Example:

{
  "configVersion": 1,
  "provider": "zai",
  "mode": "text",
  "timeoutMs": 120000,
  "retries": 0,
  "retryDelayMs": 1000,
  "parallelLimit": 8,
  "roles": {
    "scout": {
      "provider": "zai",
      "model": "glm-4.5-air",
      "tools": ["read", "grep", "find", "ls"]
    },
    "review": {
      "provider": "zai",
      "model": "glm-5.1",
      "tools": ["read", "grep", "find", "ls"]
    }
  }
}

You can change role models here if your subscription or preferred routing changes.
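
For example, routing scout through the stronger model would mean editing its role entry. This assumes role entries in the config file override the defaults in src/config.js:

```json
{
  "roles": {
    "scout": {
      "provider": "zai",
      "model": "glm-5.1",
      "tools": ["read", "grep", "find", "ls"]
    }
  }
}
```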

Safety Rules

Do not delegate:

  • .env files
  • API keys
  • private keys
  • auth tokens
  • password dumps
  • production customer data
  • payment data
  • anything governed by confidentiality requirements

Do not let cheap workers make final changes. They should return findings, drafts, or review notes. The primary agent owns final edits and verification.

Development

Install dependencies:

npm install

Run tests:

npm test

Package dry-run:

npm pack --dry-run

Test MCP tool discovery:

node --input-type=module -e "
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { StdioClientTransport } from '@modelcontextprotocol/sdk/client/stdio.js';

const transport = new StdioClientTransport({ command: 'ai-workers-mcp', args: [] });
const client = new Client({ name: 'smoke', version: '0.0.0' }, { capabilities: {} });
await client.connect(transport);
const tools = await client.listTools();
console.log(tools.tools.map((tool) => tool.name).join(','));
await client.close();
"

Run an end-to-end GLM smoke test:

printf 'alpha beta\n' | ai-summarize "Reply with exactly READY after reading input"

Expected output:

READY

Notes For LLM Maintainers

  • Keep this package technology-agnostic. Do not add Laravel, React, Python, or other stack-specific assumptions to the core prompts.
  • Add project-specific routing rules in the target project's AGENTS.md, not here.
  • Prefer changing defaults through src/config.js and tests together.
  • Keep worker prompts short. Long prompts spend more GLM tokens and make workers less predictable.
  • Do not grant write tools by default. If a future write-capable worker is added, keep it opt-in and clearly separated.
  • Treat ai_workers_parallel as cheap context gathering, not as a replacement for primary-agent judgment.
  • If adding new MCP tools, update src/mcp-tools.js, tests, and this README.
  • If changing install behavior, update src/installer.js, src/cli.js, this README, and rerun ai-workers install during verification.

Current Limitations

  • Parallel workers are plain concurrent Pi subprocesses, not native Codex subagents.
  • Codex native spawn_agent still uses OpenAI-backed agents.
  • Summarization quality depends on task wording and supplied input.
  • The installer currently targets npm-style global command availability.
  • No secret scanner blocks paths before Pi runs; safety is enforced by prompt contract and primary-agent discipline.

Release Checklist

Before publishing or using on a new machine:

npm test
npm pack --dry-run
ai-workers doctor

Then verify Codex sees the MCP server:

codex mcp list

And verify Pi sees the package:

pi list