@zg3n/agentfactory

v1.0.2

Published

8 days ago

zerog3n

Agent Factory

Agent Factory is an installable TypeScript package for running deterministic multi-agent software delivery workflows inside any repository. Install it locally or globally, run agentfactory init in a target repo, then use agentfactory dashboard or agentfactory sample from that repo.

The template is intentionally small:

ai/ contains the agent runtime, orchestration logic, decision engine, guardrail engine, dashboard server, and integration layer.
.agents/ contains the project-specific operating model: agent definitions, context, skills, decision tables, guardrails, playbooks, and integration configuration.
src/ is reserved for the application or generated code that the agents work on.
infra/ is reserved for infrastructure modules, stacks, constructs, environment configuration, variables, and deployment notes.
dist/ is build output from TypeScript.

What It Does

The runtime turns a TaskRequest into a controlled delivery loop:

The planner reads context and decision tables, then creates owned implementation steps.
The orchestrator validates the plan against guardrails and assigns builder workers.
Builders produce structured file changes for their owned paths.
The tester creates validation assets and commands.
The reviewer checks correctness, architecture fit, test coverage, and guardrail compliance.
The orchestrator retries, replans, fails, or completes the run.
Deployment decisions route infrastructure work through IaC-tool and provider-specific rules.
The dashboard exposes run state, messages, integrations, context controls, and server-sent events.
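
The loop above can be sketched as a small state machine. This is only an illustration under stated assumptions: `Phase`, `LoopHooks`, `ReviewResult`, `needsReplan`, `runDeliveryLoop`, and the `maxAttempts` default are hypothetical names and values, not the package's actual API (the real types live in ai/types.ts).

```typescript
// Hypothetical sketch of the plan -> validate -> build -> test -> review loop.
type Phase = "plan" | "validate" | "build" | "test" | "review" | "done" | "failed";

interface ReviewResult {
  approved: boolean;
  needsReplan: boolean; // assumed field: reviewer asks for a fresh plan
}

interface LoopHooks {
  plan(): string[];                    // returns owned implementation steps
  validate(steps: string[]): boolean;  // guardrail check on the plan
  build(steps: string[]): void;
  test(): boolean;
  review(): ReviewResult;
}

function runDeliveryLoop(hooks: LoopHooks, maxAttempts = 3): Phase[] {
  const trace: Phase[] = [];
  let steps: string[] = [];
  let needPlan = true;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (needPlan) {
      trace.push("plan");
      steps = hooks.plan();
      trace.push("validate");
      if (!hooks.validate(steps)) {
        trace.push("failed"); // a plan that violates guardrails ends the run
        return trace;
      }
      needPlan = false;
    }
    trace.push("build");
    hooks.build(steps);
    trace.push("test");
    const testsPassed = hooks.test();
    trace.push("review");
    const review = hooks.review();
    if (testsPassed && review.approved) {
      trace.push("done");
      return trace;
    }
    // Retry with the same plan, or replan if the reviewer asked for it.
    needPlan = review.needsReplan;
  }
  trace.push("failed"); // attempts exhausted
  return trace;
}
```

The returned trace records which phases ran, which mirrors the idea that every run should be auditable after the fact.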

The default sample task still generates a small Node/TypeScript REST API with a versioned endpoint, tests, and review evidence. The template context and decision tables now support larger project shapes: REST, WebSocket, frontend dashboard, and optional infrastructure using Terraform, Terragrunt, OpenTofu, Pulumi, or CDK.

Requirements

Node.js 22+ recommended
npm or yarn
TypeScript
Bun or Deno when the generated project targets those runtimes
Terraform, Terragrunt, OpenTofu, Pulumi, or CDK when working on infra/

Install dependencies:

npm install

Build the package and expose the local CLI:

npm run build
npm link

Initialize another repository:

cd /path/to/target-repo
agentfactory init

The init command scaffolds the .agents/ policy, context, decision tables, playbooks, skills, integration config, safe default permissions, and common AI tool context files into the target repo. It does not copy the package runtime into the target; agentfactory dashboard and agentfactory sample run from the installed package while reading .agents/ from the current repo.

It writes .agents/permissions.yaml with approvedGitRepositories: [.] and enableScripts: false; widen those settings only in repos that intentionally need it. It also writes AGENTS.md as the canonical cross-tool instruction file, plus thin CLAUDE.md and GEMINI.md adapters that import it.

Init flags:

agentfactory init --no-wizard: run a non-interactive scaffold.
agentfactory init --no-tool-context: skip AGENTS.md, CLAUDE.md, and GEMINI.md.
agentfactory init --force: replace existing scaffold files.

At runtime, the package binary loads the installed Agent Factory code and the project loader validates the target repo's .agents contract before starting work. The loader reads .agents/agents.yaml, .agents/guardrails.yaml, .agents/context/, .agents/decision-tables/, and optional .agents/permissions.yaml. With the default permissions, file changes can still be planned or applied according to task mode, but shell commands are skipped unless enableScripts is deliberately set to true.
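
A pre-flight check in the spirit of that loader might look like the sketch below. It only tests for the presence of the entries named above; the function name, the report shape, and the injected `exists` callback are assumptions for illustration, and the real loader presumably also validates file contents.

```typescript
// Entries the loader is described as reading. Names below are hypothetical.
const REQUIRED_ENTRIES = [
  ".agents/agents.yaml",
  ".agents/guardrails.yaml",
  ".agents/context",
  ".agents/decision-tables",
];
const OPTIONAL_ENTRIES = [".agents/permissions.yaml"];

interface ContractReport {
  missing: string[];        // required entries that were not found
  optionalAbsent: string[]; // optional entries that were not found
}

// `exists` is injected so the check stays pure; in a real script you could
// pass existsSync from node:fs.
function checkAgentsContract(
  root: string,
  exists: (path: string) => boolean,
): ContractReport {
  const resolve = (entry: string) => `${root.replace(/\/+$/, "")}/${entry}`;
  return {
    missing: REQUIRED_ENTRIES.filter((entry) => !exists(resolve(entry))),
    optionalAbsent: OPTIONAL_ENTRIES.filter((entry) => !exists(resolve(entry))),
  };
}
```

An empty `missing` list means the target repo has the minimum layout; a non-empty `optionalAbsent` is not an error, since permissions.yaml is optional.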

Build the framework:

npm run build

Run the sample orchestration:

agentfactory sample

Run the stack initialization wizard:

agentfactory init

Start the dashboard:

agentfactory dashboard

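By default the dashboard listens on http://0.0.0.0:4321, which accepts connections on every network interface. To restrict it to the local machine or change the port, override HOST and PORT: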

HOST=127.0.0.1 PORT=4322 agentfactory dashboard
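
As a rough sketch of how such overrides are typically resolved, the helper below reads HOST and PORT from an environment map and falls back to the documented defaults. The function name and the validation rules (for example, rejecting port 0) are assumptions, not the package's actual behavior.

```typescript
interface ListenAddress {
  host: string;
  port: number;
}

// Pass process.env in a real Node script; a plain object works for testing.
function resolveListenAddress(
  env: Record<string, string | undefined>,
): ListenAddress {
  const rawHost = (env["HOST"] ?? "").trim();
  const rawPort = (env["PORT"] ?? "").trim();

  const host = rawHost !== "" ? rawHost : "0.0.0.0"; // documented default
  const port = rawPort !== "" ? Number(rawPort) : 4321; // documented default

  // Assumption: this sketch accepts only concrete TCP ports 1..65535.
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${rawPort}`);
  }
  return { host, port };
}
```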

Repository Layout

.agents/
  agents.yaml              Agent roles, tools, schemas, max instances, and skills
  context/                 Product, constraints, tech stack, and dashboard context controls
  decision-tables/         YAML rules evaluated by the decision engine
  guardrails.yaml          Forbidden actions, required practices, and operational limits
  integrations.yaml        Slack, Discord, Telegram, Signal, and webhook integration config
  playbooks/               Task-type playbooks
  skills/                  Role-specific SKILLS.md files

ai/
  agents/                  Planner, builder, tester, reviewer, orchestrator
  core/                    Config loading, decisions, execution, guardrails
  dashboard/               Dashboard runtime, HTTP API, static UI, integrations
  sample.ts                CLI sample runner
  sampleTask.ts            Sample task factory
  types.ts                 Shared types

src/
  Application or generated project code

infra/
  Infrastructure modules, stacks, constructs, environments, variables, outputs, and deployment notes

Defining the Tech Stack

Define the intended stack in .agents/context/tech-stack.md. Agents use this as repository context when planning and generating changes.

The fastest path is the interactive wizard:

npm run build
npm run init:wizard

The wizard prompts for project name, runtime, language, package manager, REST/WebSocket capability, dashboard capability, persistence posture, IaC tool, cloud provider, and chat integrations. It then rewrites .agents/context/tech-stack.md.

Include:

Runtime and language: Node.js, Bun, or Deno
Frameworks or libraries to prefer
Package manager
Build command
Test command
Persistence choices
REST API style
WebSocket protocol expectations
Frontend dashboard stack, if any
Required IaC tool selection: Terraform, Terragrunt, OpenTofu, Pulumi, or CDK
Cloud provider targets: AWS, DigitalOcean, Google Cloud, Microsoft Azure, Vultr, OVHcloud
Repository layout rules

Example:

# Tech Stack

- Default runtime: Node.js
- Alternative runtimes: Bun or Deno when requested by task context
- Language: TypeScript
- Package manager: npm
- API runtime: built-in node:http for small services
- WebSocket runtime: explicit protocol and bounded message handling
- Frontend dashboard: browser-native TypeScript or the selected application framework
- Infrastructure: optional; use the selected IaC tool under infra/
- Cloud providers: AWS, DigitalOcean, Google Cloud, Microsoft Azure, Vultr, OVHcloud
- Tests: node:test and node:assert/strict
- Dashboard runtime: node:http with server-sent events

Repository layout:
- ai/ contains the agentic framework.
- src/ contains application code.
- test/ contains application tests.
- infra/ contains infrastructure code when enabled.

Keep this file operational. It should tell agents what to build with, not market the project.

Agent Definitions

Agents are declared in .agents/agents.yaml. The current roles are:

planner: decomposes tasks into deterministic steps.
builder: creates file changes for owned paths.
tester: creates and runs validation.
reviewer: validates correctness and quality.
orchestrator: coordinates the loop.

Each agent has:

role: its responsibility.
allowed_tools: capabilities the runtime expects it to use.
output_schema: the structured output contract.
skills: named skills backed by .agents/skills/<agent>/SKILLS.md.
max_instances: optional concurrency limit, currently used for builders.

Controlling the Number of Agents and Roles

The easiest control is builder parallelism.

In .agents/agents.yaml:

agents:
    builder:
        role: Produces diff-based code changes for an assigned file slice without bypassing guardrails.
        max_instances: 4

In .agents/guardrails.yaml, keep the operational limit aligned:

operational_limits:
    max_parallel_builders: 4

The orchestrator uses the smallest of:

builder.max_instances
guardrails.operational_limits.max_parallel_builders
the planner's builderPoolTarget

To reduce parallelism, lower either builder.max_instances or max_parallel_builders. To increase it, raise both and make sure planner steps have non-overlapping ownedPaths.
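
The two rules above, taking the minimum of the three limits and keeping ownedPaths disjoint, can be sketched as follows. The interfaces and function names are hypothetical; only the three limit sources and the ownedPaths field come from this README.

```typescript
interface BuilderLimits {
  maxInstances: number;        // agents.builder.max_instances
  maxParallelBuilders: number; // guardrails.operational_limits.max_parallel_builders
  builderPoolTarget: number;   // requested by the planner
}

function effectiveBuilderCount(limits: BuilderLimits): number {
  return Math.min(
    limits.maxInstances,
    limits.maxParallelBuilders,
    limits.builderPoolTarget,
  );
}

interface PlannedStep {
  id: string;
  ownedPaths: string[];
}

// Treat every path as a directory-style prefix so "src/api" covers
// "src/api/routes.ts" but not "src/apiv2".
function asPrefix(path: string): string {
  return path.replace(/\/+$/, "") + "/";
}

// Returns pairs of step ids whose ownedPaths overlap.
function findOverlappingSteps(steps: PlannedStep[]): [string, string][] {
  const conflicts: [string, string][] = [];
  steps.forEach((a, index) => {
    for (const b of steps.slice(index + 1)) {
      const clash = a.ownedPaths.some((pa) =>
        b.ownedPaths.some(
          (pb) =>
            asPrefix(pa).startsWith(asPrefix(pb)) ||
            asPrefix(pb).startsWith(asPrefix(pa)),
        ),
      );
      if (clash) conflicts.push([a.id, b.id]);
    }
  });
  return conflicts;
}
```

Running a check like `findOverlappingSteps` before raising the limits gives an early signal that two builders would write to the same files.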

Adding a new role requires code changes:

Add the role to AgentName in ai/types.ts.
Add its definition to .agents/agents.yaml.
Implement the agent under ai/agents/.
Wire it into ai/agents/orchestrator.ts.
Add guardrail validation if the role can produce file changes, commands, or decisions.
Add .agents/skills/<role>/SKILLS.md.

For most projects, prefer adding skills or decision tables before adding a new role.

Skills

Skills are named capabilities assigned to agents in .agents/agents.yaml and documented under .agents/skills/.

Current skill docs:

Use skills to describe durable operating behavior, such as runtime selection, API implementation, WebSocket server development, frontend dashboard development, IaC authoring, cloud security review, runtime smoke testing, failure routing, or parallel work planning. Use decision tables for specific conditional choices.

Decision Tables

Decision tables live in .agents/decision-tables/. They are YAML rule sets that agents evaluate before making material choices.

A rule includes:

id
condition
action
rationale

The planner, builder, tester, and reviewer attach matching decisions to their structured outputs. This makes runs auditable: the final output shows which architecture, API, testing, security, and deployment rules were applied.
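
To make the rule shape concrete, here is a minimal matcher. The README does not specify the condition syntax, so this sketch assumes a condition is a set of field/value pairs that must all equal the task's facts; `Facts`, `DecisionRule`, `matchRules`, and the sample rule ids are hypothetical.

```typescript
type Facts = Record<string, string | number | boolean>;

interface DecisionRule {
  id: string;
  condition: Facts; // ASSUMPTION: equality match on every listed field
  action: string;
  rationale: string;
}

// Returns every rule whose condition is fully satisfied by the facts,
// preserving table order so the output is deterministic.
function matchRules(rules: DecisionRule[], facts: Facts): DecisionRule[] {
  return rules.filter((rule) =>
    Object.entries(rule.condition).every(
      ([field, expected]) => facts[field] === expected,
    ),
  );
}

// Hypothetical table entries, for illustration only.
const sampleRules: DecisionRule[] = [
  {
    id: "ws-bounded-messages",
    condition: { requiresWebSocket: true },
    action: "enforce bounded message handling",
    rationale: "Unbounded frames are a denial-of-service risk.",
  },
  {
    id: "bun-runtime",
    condition: { runtimeTarget: "bun", requiresApi: true },
    action: "use the Bun HTTP server",
    rationale: "Match the requested runtime.",
  },
];
```

Attaching the matched rule ids and rationales to each agent's output is what makes the run auditable.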

Guardrails

Guardrails live in .agents/guardrails.yaml. They define:

forbidden actions
required practices
operational limits
protected paths
allowed and destructive command prefixes

The orchestrator validates plans, builder outputs, tester outputs, and reviewer outputs before accepting them. Guardrails are the main place to restrict blast radius.
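
A guardrail check over commands and paths could look roughly like this. The field names, the precedence (destructive prefixes are rejected before the allow-list is consulted), and the sample values are assumptions made for the sketch, not the schema of .agents/guardrails.yaml.

```typescript
interface Guardrails {
  protectedPaths: string[];
  allowedCommandPrefixes: string[];
  destructiveCommandPrefixes: string[];
}

type Verdict = { ok: true } | { ok: false; reason: string };

function checkCommand(guardrails: Guardrails, command: string): Verdict {
  const cmd = command.trim();
  const destructive = guardrails.destructiveCommandPrefixes.find((prefix) =>
    cmd.startsWith(prefix),
  );
  if (destructive !== undefined) {
    return { ok: false, reason: `destructive command prefix: ${destructive}` };
  }
  if (!guardrails.allowedCommandPrefixes.some((prefix) => cmd.startsWith(prefix))) {
    return { ok: false, reason: "command is not on the allow-list" };
  }
  return { ok: true };
}

function checkPath(guardrails: Guardrails, path: string): Verdict {
  const hit = guardrails.protectedPaths.find((protectedPath) => {
    const base = protectedPath.replace(/\/+$/, "");
    return path === base || path.startsWith(base + "/");
  });
  return hit === undefined
    ? { ok: true }
    : { ok: false, reason: `protected path: ${hit}` };
}

// Hypothetical values for illustration.
const sampleGuardrails: Guardrails = {
  protectedPaths: [".agents/"],
  allowedCommandPrefixes: ["npm test", "npm run build"],
  destructiveCommandPrefixes: ["rm -rf", "git push --force"],
};
```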

Running Tasks

You can start tasks through the dashboard API.

Start the dashboard:

agentfactory dashboard

Start the sample run:

curl -X POST http://127.0.0.1:4321/api/runs/sample

Start a custom task:

curl -X POST http://127.0.0.1:4321/api/tasks \
  -H 'content-type: application/json' \
  -d '{
    "id": "hello-api",
    "title": "Build a simple REST API",
    "description": "Create one greeting endpoint with tests.",
    "taskType": "feature",
    "targetDirectory": "tmp/hello-api",
    "executionMode": "dry_run",
    "capabilities": {
      "requiresApi": true,
      "endpointCount": 1,
      "requiresPersistence": false,
      "hasExistingDatabase": false,
      "changesCode": true,
      "touchesSecuritySurface": true,
      "requiresWebSocket": false,
      "requiresFrontendDashboard": false,
      "requiresInfrastructure": false,
      "runtimeTarget": "node",
      "iacTool": "terraform"
    }
  }'

Use executionMode: "dry_run" to inspect planned changes without writing generated output to the target directory. Use executionMode: "apply" when you want the execution engine to apply file changes and run validation commands.

For a broader backend/frontend/infrastructure task, include the optional capability fields:

{
  "capabilities": {
    "requiresApi": true,
    "endpointCount": 4,
    "requiresWebSocket": true,
    "requiresFrontendDashboard": true,
    "requiresInfrastructure": true,
    "runtimeTarget": "bun",
    "iacTool": "pulumi",
    "cloudProvider": "aws",
    "requiresPersistence": true,
    "hasExistingDatabase": false,
    "changesCode": true,
    "touchesSecuritySurface": true
  }
}

Supported runtimeTarget values are node, bun, and deno.

Supported iacTool values are terraform, terragrunt, opentofu, pulumi, and cdk.

Supported cloudProvider values are aws, digitalocean, gcp, azure, vultr, and ovhcloud.
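
If you build task payloads programmatically, a small client-side check against the supported values listed above can catch typos before the request is sent. The value lists come from this README; the `TaskCapabilityTargets` interface and `validateCapabilities` function are hypothetical helpers, not part of the package.

```typescript
const RUNTIME_TARGETS = ["node", "bun", "deno"] as const;
const IAC_TOOLS = ["terraform", "terragrunt", "opentofu", "pulumi", "cdk"] as const;
const CLOUD_PROVIDERS = [
  "aws", "digitalocean", "gcp", "azure", "vultr", "ovhcloud",
] as const;

interface TaskCapabilityTargets {
  runtimeTarget?: string;
  iacTool?: string;
  cloudProvider?: string;
}

// Returns a list of human-readable problems; an empty list means the
// three target fields only use supported values (or are omitted).
function validateCapabilities(capabilities: TaskCapabilityTargets): string[] {
  const errors: string[] = [];
  const check = (
    field: string,
    value: string | undefined,
    allowed: readonly string[],
  ): void => {
    if (value !== undefined && !allowed.includes(value)) {
      errors.push(`${field} must be one of: ${allowed.join(", ")}`);
    }
  };
  check("runtimeTarget", capabilities.runtimeTarget, RUNTIME_TARGETS);
  check("iacTool", capabilities.iacTool, IAC_TOOLS);
  check("cloudProvider", capabilities.cloudProvider, CLOUD_PROVIDERS);
  return errors;
}
```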

Dashboard API

Important endpoints:

GET / serves the dashboard UI.
GET /api/overview returns runs, integrations, messages, metrics, and context.
GET /api/runs lists runs.
GET /api/runs/:id returns a run.
POST /api/runs/sample starts the sample task.
POST /api/tasks starts a custom task.
GET /api/integrations lists chat and webhook integrations.
POST /api/integrations creates or updates an integration.
POST /api/chat submits a dashboard chat prompt.
GET /api/context reads context document/control state.
POST /api/context updates context document/control state.
GET /api/events streams server-sent events.
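
A client consuming GET /api/events receives text in the standard server-sent events wire format. The sketch below parses already-received text into events following the general SSE rules (default event type "message", comment lines starting with a colon, multiple data lines joined by newlines). The event name used in the usage note is a made-up example; the README does not list the event types the dashboard emits.

```typescript
interface SseEvent {
  event: string;
  data: string;
}

function parseSseFrames(text: string): SseEvent[] {
  const events: SseEvent[] = [];
  // Frames are separated by a blank line.
  for (const frame of text.split(/\r?\n\r?\n/)) {
    let event = "message"; // default type when no "event:" field is present
    const data: string[] = [];
    for (const line of frame.split(/\r?\n/)) {
      if (line === "" || line.startsWith(":")) continue; // blank or comment
      const colon = line.indexOf(":");
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? "" : line.slice(colon + 1);
      if (value.startsWith(" ")) value = value.slice(1); // strip one leading space
      if (field === "event") event = value;
      else if (field === "data") data.push(value);
    }
    // Frames without data (for example keep-alive comments) are not dispatched.
    if (data.length > 0) events.push({ event, data: data.join("\n") });
  }
  return events;
}
```

For example, `parseSseFrames('event: run.updated\ndata: {"id":"r1"}\n\n')` yields one event whose data can be passed to `JSON.parse`. In a browser, the built-in EventSource class handles this parsing for you.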

Using the Project With Codex Agents

This repo is designed to be worked on by coding agents.

Recommended Codex session:

codex -C /path/to/project-template --sandbox workspace-write --ask-for-approval on-request

If you need to edit protected local config like .agents/, launch a short higher-access session:

codex -C /path/to/project-template --sandbox danger-full-access --ask-for-approval on-request

Useful agent workflows:

Ask one agent to inspect decision tables while another inspects dashboard behavior.
Assign implementation work by disjoint files to avoid conflicts.
Split backend, frontend, and infrastructure work into separate owned paths: src/, dashboard/application UI paths, and infra/.
Use tool/provider-aware tasks when asking for infrastructure: include requiresInfrastructure, iacTool, and cloudProvider.
Keep .agents/context/tech-stack.md, .agents/guardrails.yaml, and .agents/agents.yaml current before asking agents to generate code.
Use dry_run for exploratory runs and apply for controlled execution.

MCP Usage

MCP servers can extend an operator or coding agent with external tools. This template does not require a specific MCP server, but it works well with MCP-enabled Codex sessions for browser automation, repo tools, issue trackers, docs, or chat systems.

Configure MCP in your Codex config, for example:

[mcp_servers.chrome-devtools]
command = "npx"
args = ["chrome-devtools-mcp@latest"]

Then use MCP-backed tools to:

inspect the dashboard in a browser
capture screenshots
test interactive UI flows
connect repository state to external systems
inspect external documentation while planning changes

MCP is an operator capability. The project runtime itself exposes HTTP APIs and integration configuration; MCP tools help agents operate and verify the project.

Chat and Program Integrations

Integrations are stored in .agents/integrations.yaml and can be managed through the dashboard or API.

Supported integration types:

slack: sends messages through Slack chat.postMessage
telegram: sends messages through Telegram Bot API
discord: sends messages through a Discord webhook
webhook: sends JSON to a configured webhook URL
signal: stored and displayed, but outbound delivery is not implemented yet

Secrets are referenced through environment variable names. Do not store tokens directly in .agents/integrations.yaml.
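
The indirection works like this: the integration settings hold the name of an environment variable, and the secret is looked up only at send time. A minimal sketch, assuming a hypothetical `resolveSecret` helper and an environment map such as process.env:

```typescript
// `settings` is the integration's settings block, e.g. { botTokenEnv: "SLACK_BOT_TOKEN" }.
// `key` names the settings field that holds the environment variable name.
function resolveSecret(
  settings: Record<string, string | undefined>,
  key: string,
  env: Record<string, string | undefined>,
): string {
  const envName = settings[key];
  if (envName === undefined || envName === "") {
    throw new Error(`settings.${key} is not set`);
  }
  const value = env[envName];
  if (value === undefined || value === "") {
    // Report the variable name only; never echo a secret value.
    throw new Error(`environment variable ${envName} is empty or unset`);
  }
  return value;
}
```

Because only the variable name is persisted, .agents/integrations.yaml can be committed without exposing tokens.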

Slack

Required settings:

botTokenEnv: environment variable containing the Slack bot token
channel: channel id or channel name accepted by Slack

Example:

export SLACK_BOT_TOKEN='xoxb-...'
curl -X POST http://127.0.0.1:4321/api/integrations \
  -H 'content-type: application/json' \
  -d '{
    "type": "slack",
    "name": "Engineering Slack",
    "enabled": true,
    "settings": {
      "botTokenEnv": "SLACK_BOT_TOKEN",
      "channel": "#agent-runs"
    }
  }'

Telegram

Required settings:

botTokenEnv: environment variable containing the Telegram bot token
chatId: target chat id

export TELEGRAM_BOT_TOKEN='...'
curl -X POST http://127.0.0.1:4321/api/integrations \
  -H 'content-type: application/json' \
  -d '{
    "type": "telegram",
    "name": "Telegram Ops",
    "enabled": true,
    "settings": {
      "botTokenEnv": "TELEGRAM_BOT_TOKEN",
      "chatId": "123456789"
    }
  }'

Discord

Required settings:

webhookEnv or urlEnv: environment variable containing the Discord webhook URL

export DISCORD_WEBHOOK_URL='https://discord.com/api/webhooks/...'
curl -X POST http://127.0.0.1:4321/api/integrations \
  -H 'content-type: application/json' \
  -d '{
    "type": "discord",
    "name": "Discord Ops",
    "enabled": true,
    "settings": {
      "webhookEnv": "DISCORD_WEBHOOK_URL"
    }
  }'

Generic Webhook

Required settings:

urlEnv: environment variable containing the webhook URL

export AGENT_WEBHOOK_URL='https://example.com/agent-events'
curl -X POST http://127.0.0.1:4321/api/integrations \
  -H 'content-type: application/json' \
  -d '{
    "type": "webhook",
    "name": "Ops Webhook",
    "enabled": true,
    "settings": {
      "urlEnv": "AGENT_WEBHOOK_URL"
    }
  }'

Sending a Message

Use /api/messages for explicit outbound delivery:

curl -X POST http://127.0.0.1:4321/api/messages \
  -H 'content-type: application/json' \
  -d '{
    "direction": "outbound",
    "sender": "agent-dashboard",
    "body": "Build completed.",
    "integrationId": "discord-discord-ops",
    "transport": "discord"
  }'

Use /api/chat to submit an operator prompt. Without runId, the dashboard interprets the prompt as a new task. With runId, it attaches follow-up context to that run.

curl -X POST http://127.0.0.1:4321/api/chat \
  -H 'content-type: application/json' \
  -d '{
    "sender": "dashboard-operator",
    "body": "Build a health endpoint with tests"
  }'
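
The same request can be assembled from TypeScript. The sketch below only builds the URL and request options so it can be inspected or passed to `fetch`; the `ChatPrompt` interface and `buildChatRequest` helper are hypothetical, while the sender, body, and runId fields follow the description above.

```typescript
interface ChatPrompt {
  sender: string;
  body: string;
  runId?: string; // omit to start a new task; set to add follow-up context to a run
}

interface ChatRequest {
  url: string;
  init: {
    method: string;
    headers: Record<string, string>;
    body: string;
  };
}

function buildChatRequest(baseUrl: string, prompt: ChatPrompt): ChatRequest {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/api/chat`,
    init: {
      method: "POST",
      headers: { "content-type": "application/json" },
      // JSON.stringify drops runId when it is undefined.
      body: JSON.stringify(prompt),
    },
  };
}
```

Usage, assuming a dashboard running locally: `const { url, init } = buildChatRequest("http://127.0.0.1:4321", { sender: "dashboard-operator", body: "Add input validation", runId: "<existing run id>" }); await fetch(url, init);`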

Customizing for a New Project

Update .agents/context/product.md with what the project is.
Update .agents/context/tech-stack.md with the real stack.
Update .agents/context/constraints.md with hard rules.
Review .agents/guardrails.yaml for protected paths and command limits.
Adjust .agents/agents.yaml skill lists and builder count.
Add or revise .agents/decision-tables/*.yaml for architecture, API, security, testing, or domain-specific rules.
Add role guidance under .agents/skills/<agent>/SKILLS.md.
Run npm run build and npm run sample.
Start the dashboard and create a dry-run task.

Development Notes

Build:

npm run build

Sample:

npm run sample

Dashboard:

npm run dashboard

The project is currently a deterministic local prototype. It does not call an LLM provider by itself. The "agents" are TypeScript classes that use repository context, decision tables, guardrails, and structured outputs. External coding agents and MCP-enabled tools can operate this repo, extend it, or wire it to model providers.
