clavue-agent-sdk

v0.7.4

In-process TypeScript agent SDK for controlled autonomous workflows, workflow contracts, proof-of-work artifacts, durable AgentJobs, memory, and orchestration policy.

Clavue Agent SDK

Clavue Agent SDK is a production-oriented TypeScript SDK for building controlled autonomous agents, workflow agents, coding agents, research agents, and repair loops. It runs the full agent loop in-process for library integrations, with no subprocess or local CLI dependency. An optional npx CLI is available for terminal and CI automation. The same runtime supports Anthropic and OpenAI-compatible APIs, including modern GPT models through the OpenAI Responses API when available.

Also available in Go: clavue-agent-sdk-go

Documentation

Programmatic Integration Guide: complete usage patterns for embedding the SDK in services, CI, workers, internal platforms, and agent products.
Capability Upgrade Program: roadmap and capability planning for controlled autonomous workflows.
Production Capabilities: deeper analysis of production runtime capabilities and gaps.

Why Clavue

Library-first agent runtime: embed run(), query(), or createAgent() directly in Node.js services, CI, workers, web backends, and internal platforms.
Low-confirmation autonomy: use autonomyMode plus permissionMode to let strong models act proactively on approved work without bypassing safety policy.
Production controls: named toolsets, allow/deny filters, hooks, permission modes, workspace path guards, schema-versioned events, policy traces, quality gates, and budget controls.
Durable workflow contracts: background AgentJobs, local issue workflows, WORKFLOW.md parsing, proof-of-work artifacts, orchestration policy helpers, runtime namespaces, session persistence, memory injection, self-improvement capture, and retro/eval loops.
Provider portability: Anthropic Messages and OpenAI-compatible providers share the same tool, memory, event, and result contracts.

What is new in 0.7.x

Controlled autonomous development mode: --autonomy autonomous with --permission-mode trustedAutomation or safer local-edit-only acceptEdits.
Public schema metadata for SDK events, run results, traces, AgentJobs, and memory traces.
Local issue workflow commands for builder, reviewer, fixer, and verifier loops.
SDK-native workflow contracts: parse WORKFLOW.md, render strict issue prompts, resolve runtime config, normalize workspaces, and validate dispatch preflight.
Host-neutral proof-of-work artifacts for runs, AgentJobs, and issue workflows, including evidence, quality gates, risks, next actions, and external references.
Host-neutral orchestration policy helpers for candidate selection, active/terminal state handling, global/per-state concurrency, blocker handling, and capped retry backoff.
Stronger workspace safety: path containment for file tools and pre-execution blocking for destructive shell patterns.
Better GPT/OpenAI-compatible handling: capability preflight, normalized provider errors, Responses API routing, and fallback when gateways do not support /responses.
AgentJob batch summaries, retro skill evaluators, runtime profiles, and richer policy decision traces.
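
The "path containment for file tools" item above can be pictured with a small check like the following. This is a minimal sketch under stated assumptions, not the SDK's actual guard: `isInsideWorkspace` is a hypothetical helper, and it does not resolve symlinks, which a production guard would likely need to handle.

```typescript
import * as path from "node:path";

// Hypothetical helper (not part of clavue-agent-sdk): returns true when
// `candidate` resolves to the workspace root or a location beneath it.
function isInsideWorkspace(workspaceRoot: string, candidate: string): boolean {
  const root = path.resolve(workspaceRoot);
  // Relative candidates are resolved against the root; absolute ones stay absolute.
  const target = path.resolve(root, candidate);
  const relative = path.relative(root, target);
  if (relative === "") return true; // the root itself
  if (path.isAbsolute(relative)) return false; // e.g. a different drive on Windows
  // Escapes show up as a leading ".." path segment.
  return relative.split(path.sep)[0] !== "..";
}

console.log(isInsideWorkspace("/repo", "src/index.ts"));
console.log(isInsideWorkspace("/repo", "../etc/passwd"));
```

Comparing path segments rather than string prefixes avoids rejecting directories whose names merely start with two dots.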

Quick start

Use directly with npx

No local install is required for quick automation from a terminal or CI job.

export CLAVUE_AGENT_API_KEY=your-api-key
npx clavue-agent-sdk "Read package.json and summarize this project"

# Safer read-only review
npx clavue-agent-sdk "Review src for obvious bugs" --toolset repo-readonly

# Combine named toolsets
npx clavue-agent-sdk "Research and review this repo" --toolset repo-readonly,research

# OpenAI-compatible model
npx clavue-agent-sdk \
  --api-type openai-completions \
  --model gpt-5.4 \
  --base-url https://api.openai.com/v1 \
  "Explain the repository structure"

# Opt-in run learning
npx clavue-agent-sdk \
  --self-improvement \
  --allow Read,Glob,Grep \
  "Review package.json for release readiness risks"

# Or enable it from CI/env
CLAVUE_AGENT_SELF_IMPROVEMENT=true \
  npx clavue-agent-sdk --allow Read,Glob,Grep "Review package.json"

CLI options: --prompt, --model, --api-type, --api-key, --base-url, --cwd, --max-turns, --allow, --toolset, --deny, --permission-mode, --autonomy, --self-improvement, --json.

Environment variables: CLAVUE_AGENT_API_KEY, CLAVUE_AGENT_API_TYPE, CLAVUE_AGENT_MODEL, CLAVUE_AGENT_BASE_URL, CLAVUE_AGENT_AUTH_TOKEN, CLAVUE_AGENT_PERMISSION_MODE, CLAVUE_AGENT_AUTONOMY, CLAVUE_AGENT_SELF_IMPROVEMENT, AGENT_SDK_MAX_TOOL_CONCURRENCY.

Best practices

Pick the right integration mode

Use npx clavue-agent-sdk ... for quick terminal automation, CI checks, and one-off repository analysis.
Use run() for backend jobs where you want one prompt in, one typed AgentRunResult out.
Use query() for streaming UIs, logs, dashboards, and integrations that need live assistant/tool events.
Use createAgent() for long-lived apps that need multi-turn state, sessions, hooks, MCP servers, custom subagents, or repeated prompts.

Start narrow, then expand tools

Prefer the smallest tool surface that can complete the task. Start with read-only tools for review and analysis, then add write or shell tools only when the workflow needs them.

# Read-only repository review
npx clavue-agent-sdk "Review this repo for release risks" \
  --toolset repo-readonly \
  --max-turns 6

# Focused code change with explicit tools
npx clavue-agent-sdk "Fix the failing package payload test" \
  --allow Read,Glob,Grep,Edit,Bash \
  --permission-mode trustedAutomation \
  --autonomy autonomous \
  --max-turns 10

# Safer low-confirmation local edits
npx clavue-agent-sdk "Update usage docs and run verification" \
  --toolset repo-edit \
  --permission-mode acceptEdits \
  --autonomy autonomous \
  --max-turns 8

# CI-friendly JSON output
npx clavue-agent-sdk "Check whether package.json is release-ready" \
  --toolset repo-readonly \
  --json

Set `cwd`, model, and budgets explicitly

For automation, set cwd, model, maxTurns, and tool permissions explicitly so runs are reproducible and bounded.

import { run } from "clavue-agent-sdk";

const result = await run({
  prompt: "Review the package for publish-readiness and return concise findings.",
  options: {
    cwd: process.cwd(),
    model: "claude-sonnet-4-6",
    toolsets: ["repo-readonly"],
    maxTurns: 6,
  },
});

if (result.status !== "completed") {
  throw new Error(result.errors?.join("\n") || result.subtype);
}

console.log(result.text);

Use structured outputs in automation

In CI or services, prefer run() or CLI --json instead of scraping assistant text from stdout. Check status, subtype, errors, usage, and total_cost_usd before deciding whether a job passed.
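
The check described above can be sketched as a small pass/fail function. This is an illustration under stated assumptions: `RunSummary` is a local stand-in that declares only fields named in this README, `evaluateRun` and the `maxCostUsd` budget are hypothetical, and the sample `status` and `subtype` values other than "completed" and "error_quality_gate_failed" are invented for the demo.

```typescript
// Local stand-in for the parts of AgentRunResult used here (not the SDK type).
interface RunSummary {
  status: string;
  subtype: string;
  errors?: string[];
  total_cost_usd?: number;
}

interface JobVerdict {
  passed: boolean;
  reason: string;
}

// Hypothetical CI helper: decide from structured fields, never from scraped text.
function evaluateRun(result: RunSummary, maxCostUsd: number): JobVerdict {
  if (result.status !== "completed") {
    const detail = result.errors?.join("; ") || result.subtype;
    return { passed: false, reason: `run ended as ${result.status}: ${detail}` };
  }
  const cost = result.total_cost_usd ?? 0;
  if (cost > maxCostUsd) {
    return { passed: false, reason: `cost ${cost.toFixed(4)} exceeded budget ${maxCostUsd.toFixed(4)}` };
  }
  return { passed: true, reason: "completed within budget" };
}

const verdict = evaluateRun(
  { status: "completed", subtype: "success", total_cost_usd: 0.01 },
  0.05,
);
console.log(verdict.passed, verdict.reason);
```

In a CI job, a failed verdict would typically translate into a non-zero exit code.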

Enforce production controls

For production hosts, combine narrow toolsets, permissionMode, qualityGatePolicy, memory policy, doctor(), and runBenchmarks() instead of relying only on prompt instructions.

import { doctor, run, runBenchmarks } from "clavue-agent-sdk";

const health = await doctor({
  toolsets: ["repo-readonly"],
  memory: { enabled: true },
});
if (health.status === "error") throw new Error("SDK runtime is not ready");

const result = await run({
  prompt: "Review the current package and report release blockers.",
  options: {
    toolsets: ["repo-readonly"],
    permissionMode: "default",
    memory: { enabled: true, policy: { mode: "brainFirst" } },
    quality_gates: [{ name: "release-review", status: "passed" }],
    qualityGatePolicy: { required: ["release-review"] },
    maxTurns: 6,
  },
});

if (result.subtype === "error_quality_gate_failed") {
  throw new Error(result.errors?.join("\n") || "Required quality gate failed");
}

const benchmarks = await runBenchmarks({ iterations: 3 });
console.log(benchmarks.metrics);

Current memory trace records policy, query, repo path, selected memory IDs, selected memory score/reason metadata, source/scope/confidence, validation state, retrieval steps, injected count, and whether retrieval happened before the first model call.

The current capability upgrade program is tracked in docs/agent-sdk-capability-upgrade-program.md. It expands the SDK beyond coding automation into collection, organization, planning, problem solving, memory intelligence, skill creation, self-learning, reusable agents, and workflow templates.

Keep prompts operational

Good prompts specify the goal, boundaries, expected output format, and verification command. Avoid broad prompts that mix unrelated work.

Good: Review src/providers/openai.ts for cancellation bugs. Do not edit files. Return findings with file:line references.
Good: Update README quick-start examples only. Run npm run build after editing.
Avoid: Make the project better.

Recommended production pattern

Store credentials in environment variables, not source code.
Pin CLAVUE_AGENT_MODEL or pass model in code for predictable behavior.
Use allowedTools or toolsets for every automated workflow.
Set maxTurns for bounded execution.
Log the final AgentRunResult metadata: status, subtype, num_turns, usage, duration_ms, and total_cost_usd.
Enable selfImprovement only for workflows where persisting run lessons is expected.
Close reusable agents with await agent.close() so sessions, MCP connections, and memory hooks flush cleanly.

Common recipes

# Explain a repository
npx clavue-agent-sdk "Explain this repository architecture" --toolset repo-readonly

# Review a pull-request checkout
npx clavue-agent-sdk "Review the current diff for bugs and release risks" --toolset repo-readonly

# Generate a machine-readable report
npx clavue-agent-sdk "Return JSON listing package release blockers" --toolset repo-readonly --json

1. Install as a library

npm install clavue-agent-sdk

2. Configure

Set the environment variables once, then start using the SDK immediately.

export CLAVUE_AGENT_API_KEY=your-api-key
# Optional
# export CLAVUE_AGENT_MODEL=claude-sonnet-4-6

OpenAI-compatible setup

export CLAVUE_AGENT_API_TYPE=openai-completions
export CLAVUE_AGENT_API_KEY=sk-...
export CLAVUE_AGENT_BASE_URL=https://api.openai.com/v1
export CLAVUE_AGENT_MODEL=gpt-4o

Anthropic-compatible gateway setup

export CLAVUE_AGENT_BASE_URL=https://openrouter.ai/api
export CLAVUE_AGENT_API_KEY=sk-or-...
export CLAVUE_AGENT_MODEL=anthropic/claude-sonnet-4

3. Easiest integration for another program

If another Node.js service just needs one clear call, use run(). It creates an agent, executes the prompt, closes the agent, and returns a complete typed artifact.

import { run } from "clavue-agent-sdk";

const result = await run({
  prompt: "Read package.json and return the name and version as JSON.",
  options: {
    cwd: process.cwd(),
    allowedTools: ["Read"],
    maxTurns: 3,
  },
});

if (result.status !== "completed") {
  throw new Error(result.errors?.join("\n") || result.subtype);
}

console.log(result.text);

run() returns AgentRunResult: status, subtype, final text, events, messages, usage, num_turns, duration_ms, duration_api_ms, total_cost_usd, timestamps, optional errors, and optional self_improvement artifacts when enabled.

4. Streaming events

Use query() when your program wants live events: assistant text, tool calls, tool results, and the final result.

import { query } from "clavue-agent-sdk";

for await (const message of query({
  prompt: "Read package.json and tell me the project name.",
  options: {
    allowedTools: ["Read", "Glob"],
  },
})) {
  if (message.type === "assistant") {
    for (const block of message.message.content) {
      if ("text" in block) console.log(block.text);
    }
  }

  if (message.type === "result") {
    console.log(`Done in ${message.num_turns} turns`);
  }
}

5. Reusable agent

Use createAgent() when your application needs multi-turn state, session persistence, MCP connections, hooks, or repeated calls.

import { createAgent } from "clavue-agent-sdk";

const agent = createAgent({ model: "claude-sonnet-4-6" });
try {
  const result = await agent.prompt("What files are in this project?");

  console.log(result.text);
  console.log(
    `Turns: ${result.num_turns}, Tokens: ${result.usage.input_tokens + result.usage.output_tokens}`,
  );
} finally {
  await agent.close();
}

6. OpenAI / GPT models

import { createAgent } from "clavue-agent-sdk";

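const agent = createAgent({
  apiType: "openai-completions",
  model: "gpt-4o",
  // Read the key from the environment instead of hardcoding it in source.
  apiKey: process.env.CLAVUE_AGENT_API_KEY,
  baseURL: "https://api.openai.com/v1",
});

try {
  const result = await agent.prompt("What files are in this project?");
  console.log(result.text);
} finally {
  // Close the agent so sessions and connections flush cleanly.
  await agent.close();
}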

apiType can also be inferred from the model name: models containing gpt-, o1, o3, deepseek, qwen, mistral, etc. automatically use openai-completions, so setting it explicitly as above is optional.
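
To make the name-based inference concrete, here is a minimal sketch of substring matching over the keywords listed above. It is an assumption about how such a check could look, not the SDK's implementation: `looksOpenAICompatible` is a hypothetical helper, and the real keyword list ("etc.") may be longer.

```typescript
// Keywords taken from the sentence above; the SDK's actual list may differ.
const OPENAI_STYLE_KEYWORDS = ["gpt-", "o1", "o3", "deepseek", "qwen", "mistral"];

// Hypothetical helper: true when the model name suggests an OpenAI-compatible API.
function looksOpenAICompatible(model: string): boolean {
  const name = model.toLowerCase();
  return OPENAI_STYLE_KEYWORDS.some((keyword) => name.indexOf(keyword) !== -1);
}

console.log(looksOpenAICompatible("gpt-4o"));
console.log(looksOpenAICompatible("claude-sonnet-4-6"));
```

Plain substring matching can misfire on unrelated names that happen to contain "o1" or "o3", which is one reason to set apiType explicitly in production configuration.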

7. Web demo

npm run web
# Open http://localhost:8081

Use this when you want a fast local sandbox for prompt-tool behavior and event streaming.

More examples

Multi-turn conversation

import { createAgent } from "clavue-agent-sdk";

const agent = createAgent({ maxTurns: 5 });

const r1 = await agent.prompt(
  'Create a file /tmp/hello.txt with "Hello World"',
);
console.log(r1.text);

const r2 = await agent.prompt("Read back the file you just created");
console.log(r2.text);

console.log(`Session messages: ${agent.getMessages().length}`);

// Close the agent once the conversation is finished.
await agent.close();

Custom tools (Zod schema)

import { z } from "zod";
import { query, tool, createSdkMcpServer } from "clavue-agent-sdk";

const getWeather = tool(
  "get_weather",
  "Get the temperature for a city",
  { city: z.string().describe("City name") },
  async ({ city }) => ({
    content: [{ type: "text", text: `${city}: 22°C, sunny` }],
  }),
);

const server = createSdkMcpServer({ name: "weather", tools: [getWeather] });

for await (const msg of query({
  prompt: "What is the weather in Tokyo?",
  options: { mcpServers: { weather: server } },
})) {
  if (msg.type === "result")
    // Fall back to 0 so a missing cost does not print "$undefined".
    console.log(`Done: $${(msg.total_cost_usd ?? 0).toFixed(4)}`);
}

Custom tools (low-level)

import {
  createAgent,
  getAllBaseTools,
  defineTool,
} from "clavue-agent-sdk";

const calculator = defineTool({
  name: "Calculator",
  description: "Evaluate a math expression",
  inputSchema: {
    type: "object",
    properties: { expression: { type: "string" } },
    required: ["expression"],
  },
  isReadOnly: true,
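  async call(input) {
    // Demo only: Function() executes arbitrary JavaScript, so never feed it
    // untrusted input. Use a real expression parser in production.
    const result = Function(`'use strict'; return (${input.expression})`)();
    return `${input.expression} = ${result}`;
  },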
});

const agent = createAgent({ tools: [...getAllBaseTools(), calculator] });
const r = await agent.prompt("Calculate 2**10 * 3");
console.log(r.text);

Skills

Skills are reusable executable workflows that extend agent capabilities. Bundled skills include coding/review helpers such as simplify, commit, review, debug, and test, plus lifecycle workflows such as define, plan, build, verify, workflow-review, ship, and repair.

import {
  createAgent,
  registerSkill,
  getAllSkills,
} from "clavue-agent-sdk";

// Register a custom skill
registerSkill({
  name: "explain",
  description: "Explain a concept in simple terms",
  userInvocable: true,
  async getPrompt(args) {
    return [
      {
        type: "text",
        text: `Explain in simple terms: ${args || "Ask what to explain."}`,
      },
    ];
  },
});

console.log(`${getAllSkills().length} skills registered`);

// The model can invoke skills via the Skill tool
const agent = createAgent();
const result = await agent.prompt('Use the "explain" skill to explain git rebase');
console.log(result.text);

Skills can also run in a forked subagent context by setting context: "fork". Forked skills create durable background AgentJobs, inherit the parent provider and permission policy, apply skill-level model and allowedTools, and preserve the subagent trace, evidence, and quality_gates on the final job record.

import {
  SkillTool,
  getAgentJob,
  registerAgents,
  registerSkill,
} from "clavue-agent-sdk";

registerAgents({
  reviewer: {
    description: "Specialized review agent",
    prompt: "Review carefully and produce concise findings.",
    tools: ["Read", "Glob", "Grep"],
  },
}, { runtimeNamespace: "docs-forked-skill" });

registerSkill({
  name: "deep-review",
  description: "Run a durable background code review",
  context: "fork",
  agent: "reviewer",
  allowedTools: ["Read", "Glob", "Grep"],
  model: "gpt-5.4",
  userInvocable: true,
  async getPrompt(args) {
    return [{ type: "text", text: `Review this target: ${args}` }];
  },
}, { runtimeNamespace: "docs-forked-skill" });

const result = await SkillTool.call(
  { skill: "deep-review", args: "src/agent.ts" },
  {
    cwd: process.cwd(),
    runtimeNamespace: "docs-forked-skill",
    model: "gpt-5.4",
    provider, // not defined in this snippet: supply the provider instance configured by your host
  },
);

const { job_id } = JSON.parse(String(result.content));
const job = await getAgentJob(job_id, { runtimeNamespace: "docs-forked-skill" });
console.log(job?.status, job?.trace, job?.evidence, job?.quality_gates);

Self-improvement memory

Enable selfImprovement when you want each structured run to capture reusable operational lessons for future runs. It is opt-in and stores bounded improvement memories after Agent.run() / top-level run() completes.

import { createAgent, queryMemories } from "clavue-agent-sdk";

const agent = createAgent({
  cwd: process.cwd(),
  memory: {
    enabled: true,
    autoInject: true,
    repoPath: process.cwd(),
  },
  selfImprovement: {
    memory: {
      repoPath: process.cwd(),
      maxEntriesPerRun: 4,
    },
  },
});

try {
  const run = await agent.run("Verify the package release is ready.");
  console.log(run.self_improvement?.savedMemories.length ?? 0);

  const lessons = await queryMemories({
    repoPath: process.cwd(),
    type: "improvement",
    text: "package release verification",
    limit: 5,
  });
  console.log(lessons.map((lesson) => lesson.title));
} finally {
  await agent.close();
}

By default this captures failed tool-result signals and terminal run failures. Successful run patterns are only saved when selfImprovement.memory.captureSuccessfulRuns is explicitly enabled. Captured text is trimmed, common API keys and bearer tokens are redacted, and future runs must still verify current repo state before applying a remembered lesson.
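
The trimming and redaction step can be pictured roughly as follows. This is a sketch under stated assumptions: `sanitizeLesson`, the two regular expressions, and the 200-character limit are all illustrative and are not taken from the SDK, whose actual rules are not documented here.

```typescript
// Illustrative patterns only (assumed, not the SDK's real redaction rules).
const SECRET_PATTERNS: RegExp[] = [
  /\bsk-[A-Za-z0-9_-]{8,}/g, // keys shaped like "sk-..."
  /\bBearer\s+[A-Za-z0-9._~+\/-]+=*/gi, // HTTP bearer tokens
];

// Hypothetical helper: redact likely secrets, then cap the length.
function sanitizeLesson(text: string, maxLength = 200): string {
  let clean = text;
  for (const pattern of SECRET_PATTERNS) {
    clean = clean.replace(pattern, "[REDACTED]");
  }
  return clean.length > maxLength ? `${clean.slice(0, maxLength)}...` : clean;
}

console.log(sanitizeLesson("Request failed with key sk-abcdefgh12345678"));
```

Redacting before trimming ensures a secret is never left half-visible at the cut-off point.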

You can combine run learning with the deterministic retro/eval cycle, and optionally allow a bounded retry loop guarded by verification gates:

const run = await agent.run("Improve this SDK safely.", {
  selfImprovement: {
    memory: { repoPath: process.cwd() },
    retro: {
      enabled: true,
      targetName: "clavue-agent-sdk",
      gates: [
        { name: "build", command: "npm", args: ["run", "build"] },
        { name: "test", command: "npm", args: ["test"] },
      ],
      loop: {
        enabled: true,
        maxAttempts: 3,
        retryPrompt: "Fix the highest-priority verified issue, then stop.",
      },
    },
  },
});

console.log(run.self_improvement?.retroLoop?.summary.completedAttempts);
console.log(run.self_improvement?.retroCycle?.summary.statusLine);

Nested retry runs automatically disable nested selfImprovement capture to keep the loop bounded. retroCycle always points at the final cycle for compatibility; retroLoop contains every cycle and retry lineage when loop mode is enabled.

Exported helpers: extractRunImprovementCandidates(run, config, options) for dry-run extraction and runSelfImprovement(run, config, options) for direct persistence/retro orchestration.

Retro / eval core

Run a deterministic engine-level evaluation loop and get structured findings, scores, and upgrade workstreams. createDefaultRetroEvaluators() inspects package/import/build/test/onboarding readiness across the four core dimensions:

import {
  createDefaultRetroEvaluators,
  runRetroEvaluation,
} from "clavue-agent-sdk";

const evaluators = createDefaultRetroEvaluators();

const result = await runRetroEvaluation({
  target: { name: "my-project", cwd: process.cwd() },
  evaluators,
});

console.log(result.scores.overall.score);
console.log(result.proposed_workstreams);

Run the full retro cycle in one call:

import {
  createDefaultRetroEvaluators,
  runRetroCycle,
} from "clavue-agent-sdk";

const cycle = await runRetroCycle({
  target: { name: "my-project", cwd: process.cwd() },
  evaluators: createDefaultRetroEvaluators(),
  gates: [
    { name: "build", command: "npm", args: ["run", "build"] },
    { name: "test", command: "npm", args: ["test"] },
  ],
  runId: "run-current",
  previousRunId: "run-previous",
  policy: { maxAttempts: 3 },
});

console.log(cycle.run.summary);
console.log(cycle.verification?.summary);
console.log(cycle.action.kind);
console.log(cycle.decision.disposition); // accepted | rejected | retry
console.log(cycle.summary.statusLine);
console.log(cycle.summary.text);

Or use the built-in defaults with just a target:

import { runRetroCycle } from "clavue-agent-sdk";

const cycle = await runRetroCycle({
  target: { name: "my-project", cwd: process.cwd() },
});

console.log(cycle.verification?.gates.map((gate) => gate.name)); // ["build", "test"]

Persist a run for later comparison:

import {
  compareRetroRuns,
  loadRetroCycle,
  loadRetroRun,
  saveRetroCycle,
  saveRetroRun,
} from "clavue-agent-sdk";

// `result` and `cycle` come from the runRetroEvaluation() and runRetroCycle() examples above.
await saveRetroRun("run-2026-04-14", result);
await saveRetroCycle("cycle-2026-04-14", cycle);
const previous = await loadRetroRun("run-2026-04-13");
const previousCycle = await loadRetroCycle("cycle-2026-04-13");

if (previous) {
  const drift = compareRetroRuns(previous, result);
  console.log(drift.scoreDeltas.overall.delta);
  console.log(drift.newFindings);
}

console.log(previousCycle?.decision.disposition);

Run fixed quality gates before or after a retro pass:

import { runRetroVerification } from "clavue-agent-sdk";

const verification = await runRetroVerification({
  target: { name: "my-project", cwd: process.cwd() },
  gates: [
    { name: "build", command: "npm", args: ["run", "build"] },
    { name: "test", command: "npm", args: ["test"] },
  ],
});

console.log(verification.passed);
console.log(verification.gates);

Decide the next machine action from retro state:

import {
  compareRetroRuns,
  createDefaultRetroEvaluators,
  decideRetroAction,
  loadRetroRun,
  runRetroEvaluation,
  runRetroVerification,
  saveRetroRun,
} from "clavue-agent-sdk";

const verification = await runRetroVerification({
  target: { name: "my-project", cwd: process.cwd() },
});

const current = await runRetroEvaluation({
  target: { name: "my-project", cwd: process.cwd() },
  // Build the evaluators here so the snippet does not depend on an earlier example.
  evaluators: createDefaultRetroEvaluators(),
});

const previous = await loadRetroRun("run-previous");
const comparison = previous ? compareRetroRuns(previous, current) : undefined;
const action = decideRetroAction({
  run: current,
  verification,
  previousRun: previous ?? undefined,
  comparison,
  attemptCount: 0,
  policy: { maxAttempts: 3 },
});

await saveRetroRun("run-current", current);
console.log(verification.summary);
console.log(action.kind);

Hooks (lifecycle events)

import { createAgent, createHookRegistry } from "clavue-agent-sdk";

const hooks = createHookRegistry({
  PreToolUse: [
    {
      handler: async (input) => {
        console.log(`About to use: ${input.toolName}`);
        // Return { block: true } to prevent tool execution
      },
    },
  ],
  PostToolUse: [
    {
      handler: async (input) => {
        console.log(`Tool ${input.toolName} completed`);
      },
    },
  ],
});

20 lifecycle events: PreToolUse, PostToolUse, PostToolUseFailure, SessionStart, SessionEnd, Stop, SubagentStart, SubagentStop, UserPromptSubmit, PermissionRequest, PermissionDenied, TaskCreated, TaskCompleted, ConfigChange, CwdChanged, FileChanged, Notification, PreCompact, PostCompact, TeammateIdle.

MCP server integration

import { createAgent } from "clavue-agent-sdk";

const agent = createAgent({
  mcpServers: {
    filesystem: {
      command: "npx",
      args: ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"],
    },
  },
});

const result = await agent.prompt("List files in /tmp");
console.log(result.text);
await agent.close();

Subagents

import { query } from "clavue-agent-sdk";

for await (const msg of query({
  prompt: "Use the code-reviewer agent to review src/index.ts",
  options: {
    agents: {
      "code-reviewer": {
        description: "Expert code reviewer",
        prompt: "Analyze code quality. Focus on security and performance.",
        tools: ["Read", "Glob", "Grep"],
      },
    },
  },
})) {
  if (msg.type === "result") console.log("Done");
}

Durable background AgentJobs

Use AgentTool with run_in_background: true when a subagent should continue without blocking the parent turn. The tool returns a durable job envelope immediately:

{
  "success": true,
  "type": "clavue.agent.job",
  "version": 1,
  "job_id": "agent_job_...",
  "status": "queued"
}

The job is persisted under the current runtime namespace, stores final output, trace, evidence, quality gates, errors, and heartbeat status, and can be inspected or cancelled through tools or SDK APIs.
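
Before trusting a `job_id` parsed from tool output, a host may want to validate the envelope shape first. The sketch below assumes only the five fields shown in the JSON example above; `parseJobEnvelope` is a hypothetical helper, not an SDK export.

```typescript
// Shape copied from the envelope example above.
interface AgentJobEnvelope {
  success: boolean;
  type: "clavue.agent.job";
  version: number;
  job_id: string;
  status: string;
}

// Hypothetical helper: returns the envelope, or undefined if the content is not one.
function parseJobEnvelope(content: string): AgentJobEnvelope | undefined {
  let value: unknown;
  try {
    value = JSON.parse(content);
  } catch (_error) {
    return undefined; // not JSON at all
  }
  if (typeof value !== "object" || value === null) return undefined;
  const record = value as Record<string, unknown>;
  const valid =
    record.type === "clavue.agent.job" &&
    typeof record.success === "boolean" &&
    typeof record.version === "number" &&
    typeof record.job_id === "string" &&
    typeof record.status === "string";
  return valid ? (record as unknown as AgentJobEnvelope) : undefined;
}

const sample = '{"success":true,"type":"clavue.agent.job","version":1,"job_id":"agent_job_demo","status":"queued"}';
console.log(parseJobEnvelope(sample)?.job_id);
```

The `agent_job_demo` identifier is made up for the demo; real identifiers come from the tool result.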

import {
  AgentTool,
  AgentJobListTool,
  AgentJobGetTool,
  AgentJobStopTool,
  getAgentJob,
  listAgentJobs,
} from "clavue-agent-sdk";

const context = {
  cwd: process.cwd(),
  runtimeNamespace: "docs-background-demo",
  model: "gpt-5.4",
  provider, // not defined in this snippet: supply the provider instance configured by your host
};

const started = await AgentTool.call({
  prompt: "Review src/ for security risks.",
  description: "security review",
  subagent_type: "Explore",
  run_in_background: true,
}, context);

const { job_id } = JSON.parse(String(started.content));
console.log(await listAgentJobs({ runtimeNamespace: context.runtimeNamespace }));
console.log(await getAgentJob(job_id, { runtimeNamespace: context.runtimeNamespace }));

await AgentJobListTool.call({}, context);
await AgentJobGetTool.call({ id: job_id }, context);
await AgentJobStopTool.call({ id: job_id, reason: "no longer needed" }, context);

Exported helpers include createAgentJob(), getAgentJob(), listAgentJobs(), stopAgentJob(), clearAgentJobs(), and the public types AgentJobRecord, AgentJobStatus, AgentJobKind, AgentJobCompletion, AgentJobStoreOptions, and CreateAgentJobInput.

AgentJob storage defaults to ~/.clavue-agent-sdk/agent-jobs; set CLAVUE_AGENT_JOBS_DIR or pass AgentJobStoreOptions.dir to isolate stores in tests or multi-tenant hosts.

Permissions and tool execution safety

import { query } from "clavue-agent-sdk";

// Trusted automation is the default; restrict tools for a read-only agent.
for await (const msg of query({
  prompt: "Review the code in src/ for best practices.",
  options: {
    toolsets: ["repo-readonly"],
    disallowedTools: ["WebSearch"],
    canUseTool: async (tool, input) => {
      if (tool.name === "Read") return { behavior: "allow" };
      return { behavior: "allow", updatedInput: input };
    },
  },
})) {
  // ...
}

Tool access is controlled in layers: toolsets and allowedTools choose the available tool names, disallowedTools removes names last, canUseTool can deny or rewrite a specific tool input, and hooks can block lifecycle events. Subagents inherit the parent permission policy.
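
The ordering of the name-level layers can be sketched as a pure function. Several things here are assumptions made for illustration: the contents of the `repo-readonly` toolset, the idea that toolsets and allowedTools are combined as a union, and the `resolveToolNames` helper itself. Only the rule that disallowedTools is applied last comes from the text above.

```typescript
// Hypothetical toolset contents, for illustration only.
const TOOLSETS: Record<string, string[]> = {
  "repo-readonly": ["Read", "Glob", "Grep"],
};

interface ToolFilter {
  toolsets?: string[];
  allowedTools?: string[];
  disallowedTools?: string[];
}

// Sketch: select names from toolsets and allowedTools, then remove denied names last.
function resolveToolNames(filter: ToolFilter): string[] {
  const selected = new Set<string>();
  for (const setName of filter.toolsets ?? []) {
    for (const tool of TOOLSETS[setName] ?? []) selected.add(tool);
  }
  for (const tool of filter.allowedTools ?? []) selected.add(tool);
  for (const tool of filter.disallowedTools ?? []) selected.delete(tool);
  return Array.from(selected).sort();
}

console.log(
  resolveToolNames({
    toolsets: ["repo-readonly"],
    allowedTools: ["Bash"],
    disallowedTools: ["Grep"],
  }),
);
```

Because removal happens last, a name in disallowedTools wins even if a toolset or allowedTools also lists it.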

permissionMode also has built-in semantics. default allows read-only tools only. plan freezes mutating tools while allowing planning/read tools. acceptEdits allows local file edits but blocks shell, network, external-state, destructive, or approval-required tools. trustedAutomation and bypassPermissions are high-trust modes; still use allowedTools, disallowedTools, and canUseTool for least privilege.
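
One way to read the mode semantics above is as a lookup table from mode to permitted tool categories. The mode names come from this README; the category names, the `allowedByMode` helper, and the exact cell values (for example, whether acceptEdits permits planning tools) are assumptions made for the sketch and may not match the SDK's real classification.

```typescript
type PermissionMode =
  | "default"
  | "plan"
  | "acceptEdits"
  | "trustedAutomation"
  | "bypassPermissions";

// Hypothetical tool categories, invented for this illustration.
type ToolCategory = "read" | "planning" | "edit" | "shell" | "network";

const ALL_CATEGORIES: ToolCategory[] = ["read", "planning", "edit", "shell", "network"];

// Assumed reading of the paragraph above, expressed as data.
const MODE_ALLOWS: Record<PermissionMode, ToolCategory[]> = {
  "default": ["read"],
  "plan": ["read", "planning"],
  "acceptEdits": ["read", "planning", "edit"],
  "trustedAutomation": ALL_CATEGORIES,
  "bypassPermissions": ALL_CATEGORIES,
};

function allowedByMode(mode: PermissionMode, category: ToolCategory): boolean {
  return MODE_ALLOWS[mode].indexOf(category) !== -1;
}

console.log(allowedByMode("acceptEdits", "edit"));
console.log(allowedByMode("acceptEdits", "shell"));
```

Even in the high-trust rows, the name-level filters and canUseTool still apply on top of this table, as the paragraph above recommends.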

Low-confirmation development mode

Use autonomyMode: "autonomous" when the user has already authorized a development task and wants the agent to inspect, edit, verify, and repair without routine confirmation prompts. This changes initiative and question-asking behavior only; it does not bypass permissionMode, tool filters, hooks, or host canUseTool.

import { run } from "clavue-agent-sdk";

const result = await run({
  prompt: "Resolve the P0-P3 todo list, fix failures, and run verification.",
  options: {
    cwd: process.cwd(),
    model: "gpt-5.5",
    toolsets: ["repo-edit"],
    allowedTools: ["Bash"],
    permissionMode: "trustedAutomation",
    autonomyMode: "autonomous",
    maxTurns: 16,
  },
});

console.log(result.trace?.policy_decisions);

CLI equivalent:

CLAVUE_AGENT_AUTONOMY=autonomous \
CLAVUE_AGENT_PERMISSION_MODE=trustedAutomation \
npx clavue-agent-sdk "Fix the P0-P3 todo list and verify" \
  --toolset repo-edit \
  --allow Bash \
  --json

For safer local-edit-only automation, combine autonomyMode: "autonomous" with permissionMode: "acceptEdits" and omit shell/network tools. Run traces include policy_decisions for both allows and denials, with a safe input summary instead of raw tool input, plus the backward-compatible permission_denials list.

Local issue workflows

Use the issue workflow when you want a bounded builder, reviewer, fixer, and verifier loop around a concrete bug report or todo item. The issue run command creates the workflow record and background jobs without executing the full loop; the issue execute command runs the local workflow loop immediately.

# Create a workflow from inline text
npx clavue-agent-sdk issue run "Fix provider retry handling for 429 responses" \
  --passing-score 85 \
  --require-gate tests \
  --json

# Execute from a local markdown issue
npx clavue-agent-sdk issue execute .clavue/issues/p0-provider-retry.md \
  --max-iterations 3 \
  --passing-score 90 \
  --require-gate build,tests \
  --json

# Inspect and stop workflow runs
npx clavue-agent-sdk issue list --json
npx clavue-agent-sdk issue get issue_run_... --json
npx clavue-agent-sdk issue stop issue_run_... --json

Programmatic usage:

import { normalizeIssueInput, runIssueWorkflow } from "clavue-agent-sdk";

const workflow = await runIssueWorkflow({
  cwd: process.cwd(),
  issue: normalizeIssueInput("Fix flaky package payload verification."),
  maxIterations: 3,
  passingScore: 90,
  requiredGates: ["build", "tests"],
});

console.log(workflow.status, workflow.finalScore);
console.log(workflow.proof_of_work.status, workflow.proof_of_work.verification);

Issue workflow records are stored under ~/.clavue-agent-sdk/issue-runs by default. Use the SDK store options to isolate runs for tests, CI, or multi-tenant hosts. runIssueWorkflow() returns proof_of_work, so hosts get a standard handoff artifact without the SDK owning GitHub, PR, CI, Linear, or Jira integrations.
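
To make the roles of maxIterations, passingScore, and requiredGates concrete, here is a minimal sketch of a bounded accept-or-retry loop. It is not the SDK's implementation: the Review shape, the attempt callback, and the rule that a run is accepted only when the score meets the threshold and every required gate has passed are all assumptions for illustration.

```typescript
// Hypothetical shapes for this sketch only.
type GateResult = { name: string; status: "passed" | "failed" | "skipped" };
type Review = { score: number; gates: GateResult[] };

// Assumed rule: accept when the score meets the threshold and all required gates passed.
function isAccepted(review: Review, passingScore: number, requiredGates: string[]): boolean {
  const passed = new Set(review.gates.filter((g) => g.status === "passed").map((g) => g.name));
  return review.score >= passingScore && requiredGates.every((name) => passed.has(name));
}

// Run build/review/fix attempts until one is accepted or the iteration budget is spent.
async function runBoundedLoop(
  attempt: (iteration: number) => Promise<Review>,
  opts: { maxIterations: number; passingScore: number; requiredGates: string[] },
): Promise<{ accepted: boolean; iterations: number; finalScore: number }> {
  let last: Review = { score: 0, gates: [] };
  for (let i = 1; i <= opts.maxIterations; i++) {
    last = await attempt(i);
    if (isAccepted(last, opts.passingScore, opts.requiredGates)) {
      return { accepted: true, iterations: i, finalScore: last.score };
    }
  }
  return { accepted: false, iterations: opts.maxIterations, finalScore: last.score };
}
```

The hard iteration cap is what keeps the loop bounded: a workflow that never reaches the passing score still terminates and reports its last score.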

Workflow contracts, proof of work, and orchestration policy

For host applications that want Symphony-style discipline without coupling the SDK to a tracker or daemon, use the SDK-core workflow primitives:

import {
  createProofOfWork,
  loadWorkflowDefinition,
  renderWorkflowPrompt,
  resolveWorkflowServiceConfig,
  selectDispatchCandidates,
  validateWorkflowDispatchConfig,
} from "clavue-agent-sdk";

const definition = await loadWorkflowDefinition({ cwd: repoPath });
const config = resolveWorkflowServiceConfig(definition);
const configIssues = validateWorkflowDispatchConfig(config, { requireTracker: false });
if (configIssues.length > 0) throw new Error(configIssues[0]!.message);

const selection = selectDispatchCandidates({
  config,
  issues: [{
    id: "issue-42",
    identifier: "SDK-42",
    title: "Fix autonomous workflow handoff",
    state: "Todo",
    priority: 1,
  }],
});

const prompt = renderWorkflowPrompt(definition, {
  issue: {
    identifier: selection.selected[0]?.identifier,
    title: selection.selected[0]?.title,
    description: "Produce a tested SDK-core implementation and proof of work.",
  },
});

const proof = createProofOfWork({
  target: { kind: "issue", id: "SDK-42", title: "Fix autonomous workflow handoff" },
  evidence: [{ type: "test", summary: "Focused verification passed", source: "external" }],
  quality_gates: [{ name: "tests", status: "passed" }],
  required_gates: ["tests"],
  references: [{ type: "issue", label: "Host issue", url: "https://tracker.example/SDK-42" }],
});

console.log(prompt);
console.log(proof.status, proof.handoff);

The SDK standardizes the contract, proof, and policy layers. Your host application still owns task polling, external tracker updates, PR creation, CI execution, dashboards, and worker lifecycle.

Runtime profiles

Runtime profiles turn a high-level workflow mode into concrete toolsets, permission mode, memory policy, autonomy mode, prompt guidance, and quality-gate behavior. This is the recommended path for hosts that want consistent behavior across collect, organize, plan, solve, build, verify, review, and ship flows.

import { getAllRuntimeProfiles, run } from "clavue-agent-sdk";

console.log(getAllRuntimeProfiles().map((profile) => profile.name));

const result = await run({
  prompt: "Verify this package is ready to publish.",
  options: {
    workflowMode: "verify",
    cwd: process.cwd(),
    maxTurns: 6,
  },
});

console.log(result.status, result.trace?.policy_decisions);

The engine only parallelizes tool calls when a tool declares both isReadOnly() and isConcurrencySafe(). Mutating tools and read-only tools that are not concurrency-safe run serially. Set maxToolConcurrency per run to cap safe parallel batches; when omitted, AGENT_SDK_MAX_TOOL_CONCURRENCY is used as the fallback. Invalid, zero, or negative values fall back to 10 so runs do not hang. Run traces include tool_concurrency_limit, tool_concurrency_source, and the existing concurrency_batches.
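
The rules in this paragraph can be sketched in two small functions: one resolves the effective limit, the other groups tool calls into batches. This is an illustration, not the engine's code. The source labels ("option", "env", "default"), the ToolCall shape, and the choice to send an invalid per-run value straight to 10 rather than to the environment variable are assumptions.

```typescript
const DEFAULT_TOOL_CONCURRENCY = 10;
type LimitSource = "option" | "env" | "default"; // hypothetical labels

// The per-run option wins; the env value is used only when the option is omitted.
// Invalid, zero, or negative values fall back to the default of 10.
function resolveToolConcurrency(option?: number, envValue?: string): { limit: number; source: LimitSource } {
  const raw = option !== undefined ? option : envValue !== undefined ? Number(envValue) : undefined;
  const source: LimitSource = option !== undefined ? "option" : "env";
  if (raw !== undefined && Number.isInteger(raw) && raw > 0) return { limit: raw, source };
  return { limit: DEFAULT_TOOL_CONCURRENCY, source: "default" };
}

type ToolCall = { name: string; readOnly: boolean; concurrencySafe: boolean };

// Consecutive calls that are both read-only and concurrency-safe share a batch
// of at most `limit` calls; every other call runs alone, in order.
function planBatches(calls: ToolCall[], limit: number): ToolCall[][] {
  const batches: ToolCall[][] = [];
  let current: ToolCall[] = [];
  for (const call of calls) {
    if (call.readOnly && call.concurrencySafe) {
      current.push(call);
      if (current.length === limit) {
        batches.push(current);
        current = [];
      }
    } else {
      if (current.length > 0) batches.push(current);
      batches.push([call]);
      current = [];
    }
  }
  if (current.length > 0) batches.push(current);
  return batches;
}
```

With a limit of 2, the sequence Read, Grep, Edit, Glob becomes three batches: Read and Grep together, then Edit alone, then Glob.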

Provider retries and tolerance

Provider calls automatically retry transient API and network failures with exponential backoff. Retryable conditions include rate limits, common 5xx/overload statuses, fetch/socket failures, and responses that carry a Retry-After header; abort signals are honored during backoff.
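
For readers who want to see the general technique, here is a minimal sketch of capped exponential backoff with an abortable wait. It is not the SDK's retry code. The error shape (status, retryAfterMs), the retryable-status rule, and the default attempt and delay values are assumptions chosen for the example.

```typescript
// Hypothetical error shape: `status` is an HTTP status, `retryAfterMs` a parsed Retry-After value.
type RetryableError = Error & { status?: number; retryAfterMs?: number };

function isRetryable(error: RetryableError): boolean {
  // In this sketch a missing status is treated as a network-level failure.
  return error.status === undefined || error.status === 429 || error.status >= 500;
}

// Wait for `ms`, rejecting early if the signal aborts.
function sleep(ms: number, signal?: AbortSignal): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    if (signal?.aborted) {
      reject(new Error("aborted"));
      return;
    }
    const timer = setTimeout(() => {
      signal?.removeEventListener("abort", onAbort);
      resolve();
    }, ms);
    function onAbort(): void {
      clearTimeout(timer);
      reject(new Error("aborted"));
    }
    signal?.addEventListener("abort", onAbort, { once: true });
  });
}

async function withRetry<T>(
  call: () => Promise<T>,
  opts: { maxAttempts?: number; baseDelayMs?: number; maxDelayMs?: number; signal?: AbortSignal } = {},
): Promise<T> {
  const { maxAttempts = 4, baseDelayMs = 500, maxDelayMs = 8000, signal } = opts;
  for (let attempt = 1; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      const error = err as RetryableError;
      if (attempt >= maxAttempts || !isRetryable(error)) throw error;
      // Double the delay on each attempt, capped at maxDelayMs; a server hint takes precedence.
      const backoff = Math.min(maxDelayMs, baseDelayMs * 2 ** (attempt - 1));
      await sleep(error.retryAfterMs ?? backoff, signal);
    }
  }
}
```

Production retry loops usually add random jitter to the delay so that many clients do not retry in lockstep; it is left out here to keep the sketch short.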

For OpenAI-compatible GPT-5 models, the SDK uses the Responses API by default and falls back to Chat Completions when a gateway does not support /responses. Incomplete Responses output caused by output-token limits maps to max_tokens so the engine can continue; failed or cancelled Responses runs surface as errors instead of empty text.
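
The status handling described above could look roughly like the sketch below. The status values and the max_output_tokens reason follow the OpenAI Responses API; the flattened text field, the StopReason names, and the function itself are simplifications invented for this example, not the SDK's provider code.

```typescript
// Simplified result shape: `text` stands in for the concatenated output items.
type ResponsesResult = {
  status: "completed" | "incomplete" | "failed" | "cancelled";
  text?: string;
  incomplete_details?: { reason?: string } | null;
  error?: { message?: string } | null;
};

type StopReason = "end_turn" | "max_tokens"; // hypothetical internal stop reasons

function mapResponsesResult(result: ResponsesResult): { text: string; stopReason: StopReason } {
  switch (result.status) {
    case "completed":
      return { text: result.text ?? "", stopReason: "end_turn" };
    case "incomplete":
      // Output truncated by the token limit: report max_tokens so the caller can continue.
      if (result.incomplete_details?.reason === "max_output_tokens") {
        return { text: result.text ?? "", stopReason: "max_tokens" };
      }
      throw new Error(`Responses run incomplete: ${result.incomplete_details?.reason ?? "unknown"}`);
    case "failed":
    case "cancelled":
      // Surface an error instead of returning empty text.
      throw new Error(`Responses run ${result.status}: ${result.error?.message ?? "no details"}`);
  }
}
```

Throwing on failed and cancelled runs keeps an upstream failure from being mistaken for a legitimately empty answer.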

Web UI

A built-in web chat interface is included for testing:

npx tsx examples/web/server.ts
# Open http://localhost:8081

API reference

Which API should I use?

| Need | Use |
| --- | --- |
| Terminal or CI one-off task | npx clavue-agent-sdk "prompt" |
| Simplest Node.js integration | run({ prompt, options }) |
| Streaming UI or progress logs | query({ prompt, options }) |
| Multi-turn service, sessions, MCP, hooks | createAgent(options) |

Program logic

Your app calls run(), query(), or a reusable agent.prompt() / agent.query().
The SDK builds the system context from options, repo context files, git status, tools, MCP servers, skills, hooks, and permission policy.
The provider layer sends normalized messages and tool schemas to Anthropic Messages or an OpenAI-compatible endpoint (the Responses API or Chat Completions).
When the model requests a tool, the engine applies allow/deny filters, canUseTool, permission mode, and hooks, then executes the tool.
Tool results are appended to the conversation and the engine repeats until the provider returns a final answer or the run reaches limits.
The SDK returns either streaming SDKMessage events or a structured AgentRunResult artifact. Reusable agents can persist sessions under ~/.clavue-agent-sdk, and background AgentJobs persist under ~/.clavue-agent-sdk/agent-jobs.
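
The steps above follow the usual shape of an agentic loop. The sketch below shows that shape with a scripted provider and no real SDK calls. Every type and function in it (Provider, Tool, runLoop, isPermitted) is hypothetical and greatly simplified compared with what the engine does.

```typescript
type ToolCall = { id: string; name: string; input: Record<string, unknown> };
type ModelTurn = { text: string; toolCalls: ToolCall[] };
type Message = { role: "user" | "assistant" | "tool"; content: string };
type Provider = (messages: Message[]) => Promise<ModelTurn>;
type Tool = (input: Record<string, unknown>) => Promise<string>;

async function runLoop(
  prompt: string,
  provider: Provider,
  tools: Record<string, Tool>,
  isPermitted: (call: ToolCall) => boolean,
  maxTurns = 10,
): Promise<{ status: "completed" | "max_turns"; text: string; messages: Message[] }> {
  const messages: Message[] = [{ role: "user", content: prompt }];
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = await provider(messages);
    messages.push({ role: "assistant", content: reply.text });
    // No tool requests means the model produced its final answer.
    if (reply.toolCalls.length === 0) return { status: "completed", text: reply.text, messages };
    for (const call of reply.toolCalls) {
      const tool = tools[call.name];
      let output: string;
      if (!tool) output = `Unknown tool: ${call.name}`;
      else if (!isPermitted(call)) output = `Permission denied: ${call.name}`;
      else output = await tool(call.input);
      // Tool results go back into the conversation for the next turn.
      messages.push({ role: "tool", content: output });
    }
  }
  return { status: "max_turns", text: "", messages };
}
```

Denied and unknown tools are reported back to the model as tool results rather than thrown, so the model can choose another approach on the next turn.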

Top-level functions

| Function | Description |
| --- | --- |
| run({ prompt, options }) | One-shot blocking run, returns Promise<AgentRunResult> |
| query({ prompt, options }) | One-shot streaming query, returns AsyncGenerator<SDKMessage> |
| createAgent(options) | Create a reusable agent with session persistence |
| tool(name, desc, schema, handler) | Create a tool with Zod schema validation |
| createSdkMcpServer({ name, tools }) | Bundle tools into an in-process MCP server |
| defineTool(config) | Low-level tool definition helper |
| doctor(options) | Run structured provider, tool, skill, MCP, storage, and package checks |
| runBenchmarks(options) | Run offline benchmark metrics without live model calls |
| getAllBaseTools() | Get all 35+ built-in tools |
| registerSkill(definition) | Register a custom skill |
| getAllSkills() | Get all registered skills |
| createAgentJob(input, opts) | Create a durable background agent job record |
| getAgentJob(id, opts) | Read a durable background job by ID |
| listAgentJobs(opts) | List durable background jobs in a runtime namespace |
| stopAgentJob(id, reason, opts) | Cancel a queued or running background job |
| clearAgentJobs(opts) | Clear background jobs for a runtime namespace |
| runSelfImprovement(run, config, opts) | Persist bounded improvement memories and optionally run retro/eval feedback |
| extractRunImprovementCandidates(run, config, opts) | Inspect which improvement memories a run would generate |
| runRetroEvaluation(input) | Run deterministic retro/eval orchestration and return typed results |
| createDefaultRetroEvaluators() | Inspect package/import/build/test/onboarding readiness across the core dimensions |
| compareRetroRuns(previous, current) | Compare two retro runs for score deltas and finding drift |
| decideRetroAction(input) | Decide the next machine action from current retro state |
| runRetroVerification(input) | Run fixed quality gates and return pass/fail command results |
| runRetroCycle(input) | Run evaluation, verification, policy, comparison, and optional persistence in one call |
| saveRetroRun(runId, result, opts) | Persist a retro run result to the run ledger |
| loadRetroRun(runId, opts) | Load a persisted retro run result from the run ledger |
| saveRetroCycle(cycleId, result, opts) | Persist a full retro cycle result including decision and summary |
| loadRetroCycle(cycleId, opts) | Load a persisted retro cycle result from the run ledger |
| normalizeIssueInput(input, source?) | Normalize inline or file-backed issue text into a workflow record |
| createIssueWorkflowRun(input, opts) | Create a durable local issue workflow with role-based AgentJobs |
| runIssueWorkflow(input, opts) | Execute a bounded local builder/reviewer/fixer/verifier loop and return proof_of_work |
| listIssueWorkflowRuns(opts) | List persisted issue workflow runs |
| loadIssueWorkflowRun(id, opts) | Load one persisted issue workflow run |
| stopIssueWorkflowRun(id, reason, opts) | Stop an issue workflow run and cancel its associated jobs |
| loadWorkflowDefinition(opts) | Load a repository-owned WORKFLOW.md contract |
| renderWorkflowPrompt(def, input) | Strictly render an issue/task prompt from a workflow contract |
| resolveWorkflowServiceConfig(def) | Resolve workflow defaults, env indirection, workspaces, and runtime settings |
| validateWorkflowDispatchConfig(config) | Validate workflow config before dispatch |
| selectDispatchCandidates(input) | Select eligible issues under active/terminal state and concurrency policy |
| calculateRetryDelayMs(input) | Compute continuation or capped exponential retry delay |
| shouldReleaseIssueForState(state, config) | Decide whether an issue state should release a claim |
| createProofOfWork(input) | Create a standard proof-of-work handoff artifact |
| getRuntimeProfile(mode) | Read a built-in workflow profile |
| getAllRuntimeProfiles() | List built-in workflow profiles |
| applyRuntimeProfile(options) | Expand workflowMode into concrete runtime options |
| normalizeFindings(findings) | Normalize retro findings into a stable schema |
| scoreFindings(findings) | Compute per-dimension and overall retro scores |
| planUpgrades(findings) | Turn retro findings into prioritized workstreams |
| createProvider(apiType, opts) | Create an LLM provider directly |
| createHookRegistry(config) | Create a hook registry for lifecycle events |
| listSessions() | List persisted sessions |
| forkSession(id) | Fork a session for branching |

Agent methods

| Method | Description |
| --- | --- |
| agent.query(prompt) | Streaming query, returns AsyncGenerator<SDKMessage> |
| agent.run(text, overrides) | Blocking run, returns full AgentRunResult including self_improvement when enabled |
| agent.prompt(text) | Blocking query, returns Promise<QueryResult> |
| agent.getMessages() | Get conversation history |
| agent.clear() | Reset session |
| agent.interrupt() | Abort current query |
| agent.setModel(model) | Change model mid-session |
| agent.setPermissionMode(mode) | Change permission mode |
| agent.stopTask(id) | Stop a durable AgentJob by ID, then fall back to legacy task cancellation |
| agent.getApiType() | Get current API type |
| agent.close() | Close MCP connections, persist session |

Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| apiType | string | auto-detected | 'anthropic-messages' or 'openai-completions' |
| model | string | claude-sonnet-4-6 | LLM model ID |
| apiKey | string | CLAVUE_AGENT_API_KEY | API key |
| baseURL | string | - | Custom API endpoint |
| cwd | string | process.cwd() | Working directory |
| systemPrompt | string | - | System prompt override |
| appendSystemPrompt | string | - | Append to default system prompt |
| tools | ToolDefinition[] | All built-in | Available tools |
| toolsets | ToolsetName[] | - | Named built-in tool groups |
| allowedTools | string[] | - | Tool allow-list |
| disallowedTools | string[] | - | Tool deny-list |
| permissionMode | string | trustedAutomation | trustedAutomation / auto / default / acceptEdits / dontAsk / bypassPermissions / plan |
| autonomyMode | string | inferred from permission/profile | supervised / proactive / autonomous; controls initiative and confirmations without bypassing permissions |
| canUseTool | function | allow all | Custom tool guard or input modifier |
| qualityGatePolicy | QualityGatePolicy | - | Mark a successful run as failed when required quality gates fail or are missing |
| maxTurns | number | 10 | Max agentic turns |
| maxToolConcurrency | number | env or 10 | Max concurrent read-only concurrency-safe tool calls per batch |
| maxBudgetUsd | number | - | Spending cap |
| thinking | ThinkingConfig | { type: 'adaptive' } | Extended thinking |
| effort | string | high | Reasoning effort: low / medium / high / max |
| mcpServers | Record<string, McpServerConfig> | - | MCP server connections |
| agents | Record<string, AgentDefinition> | - | Subagent definitions |
| hooks | Record<string, HookCallbackMatcher[]> | - | Lifecycle hooks |
| memory | MemoryConfig | - | Structured memory injection, off / autoInject / brainFirst policy, and session-summary persistence |
| selfImprovement | boolean \| SelfImprovementConfig | false | Opt-in run learning via improvement memories and optional retro cycle |
| resume | string | - | Resume session by ID |
| continue | boolean | false | Continue most recent session |
| persistSession | boolean | true | Persist session to disk |
| sessionId | string | auto | Explicit session ID |
| outputFormat | { type: 'json_schema', schema } | - | Structured output |
| sandbox | SandboxSettings | - | Filesystem/network sandbox |
| settingSources | SettingSource[] | - | Load AGENT.md, project settings |
| env | Record<string, string> | - | Environment variables |
| abortController | AbortController | - | Cancellation controller |

Named toolsets

Use toolsets in the SDK or --toolset in the CLI to enable named groups of built-in tools without listing every tool name. The SDK also exports TOOLSET_NAMES, isToolsetName(), and getToolsetTools() for validation and UI generation.

import { TOOLSET_NAMES, getToolsetTools, isToolsetName, run } from "clavue-agent-sdk";

const selected = "repo-readonly";
if (!isToolsetName(selected)) throw new Error("Unknown toolset");

const result = await run({
  prompt: "Review this repository and check current docs.",
  options: {
    toolsets: [selected, "research"],
    disallowedTools: ["WebSearch"],
  },
});

console.log(TOOLSET_NAMES);
console.log(getToolsetTools([selected]));

| Toolset | Tools |
| --- | --- |
| repo-readonly | Read, Glob, Grep |
| repo-edit | Read, Write, Edit, Glob, Grep, NotebookEdit |
| research | WebFetch, WebSearch |
| planning | EnterPlanMode, ExitPlanMode, AskUserQuestion, TodoWrite |
| tasks | TaskCreate, TaskList, TaskUpdate, TaskGet, TaskStop, TaskOutput |
| automation | CronCreate, CronDelete, CronList, RemoteTrigger |
| agents | Agent, AgentJobList, AgentJobGet, AgentJobStop, SendMessage, TeamCreate, TeamDelete |
| mcp | ListMcpResources, ReadMcpResource |
| skills | Skill |

toolsets are merged with allowedTools; disallowedTools is applied last and can remove tools from either source. For example, toolsets: ["repo-readonly"] plus allowedTools: ["WebFetch"] enables Read, Glob, Grep, and WebFetch; adding disallowedTools: ["Grep"] removes Grep.
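
The merge order can be checked with a small stand-alone model. The function below is a sketch written for this README, not the SDK's getToolsetTools(); its TOOLSETS map copies three rows from the toolset table, and resolveToolNames is a hypothetical name.

```typescript
// Subset of the toolset table, copied for illustration.
const TOOLSETS: Record<string, string[]> = {
  "repo-readonly": ["Read", "Glob", "Grep"],
  "repo-edit": ["Read", "Write", "Edit", "Glob", "Grep", "NotebookEdit"],
  research: ["WebFetch", "WebSearch"],
};

function resolveToolNames(opts: {
  toolsets?: string[];
  allowedTools?: string[];
  disallowedTools?: string[];
}): string[] {
  const enabled = new Set<string>();
  // 1. Expand each named toolset into its tool names.
  for (const name of opts.toolsets ?? []) {
    for (const tool of TOOLSETS[name] ?? []) enabled.add(tool);
  }
  // 2. Merge the explicit allow-list.
  for (const tool of opts.allowedTools ?? []) enabled.add(tool);
  // 3. Apply the deny-list last, so it can remove names from either source.
  for (const tool of opts.disallowedTools ?? []) enabled.delete(tool);
  return Array.from(enabled);
}

console.log(
  resolveToolNames({
    toolsets: ["repo-readonly"],
    allowedTools: ["WebFetch"],
    disallowedTools: ["Grep"],
  }),
); // [ 'Read', 'Glob', 'WebFetch' ]
```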

Environment variables

| Variable | Description |
| --- | --- |
| CLAVUE_AGENT_API_KEY | API key (required) |
| CLAVUE_AGENT_API_TYPE | anthropic-messages (default) or openai-completions |
| CLAVUE_AGENT_MODEL | Default model override |
| CLAVUE_AGENT_BASE_URL | Custom API endpoint |
| CLAVUE_AGENT_AUTH_TOKEN | Alternative auth token |
| CLAVUE_AGENT_JOBS_DIR | Override durable AgentJob storage directory |
| AGENT_SDK_MAX_TOOL_CONCURRENCY | Max concurrent batch size for tools that are both read-only and concurrency-safe; invalid values fall back to 10 |

Built-in tools

Filesystem tools resolve paths relative to cwd but may access absolute paths when the host exposes them. For least privilege, combine cwd, toolsets, allowedTools/disallowedTools, canUseTool, and sandbox settings at the application boundary.
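
One way a host could enforce a workspace boundary is a path check applied at the application boundary, for example from a canUseTool callback. The helper below is a sketch under that assumption; isInsideWorkspace is a hypothetical name, not an SDK export, and it does not resolve symlinks.

```typescript
import * as path from "node:path";

// Returns true when `candidate`, resolved against `cwd`, stays inside `cwd`.
function isInsideWorkspace(cwd: string, candidate: string): boolean {
  const root = path.resolve(cwd);
  const target = path.resolve(root, candidate);
  const relative = path.relative(root, target);
  // An empty relative path is the root itself. A path that climbs out starts with "..".
  const escapes = relative === ".." || relative.startsWith(".." + path.sep) || path.isAbsolute(relative);
  return !escapes;
}

console.log(isInsideWorkspace("/repo", "src/index.ts")); // true
console.log(isInsideWorkspace("/repo", "../etc/passwd")); // false
console.log(isInsideWorkspace("/repo", "/etc/passwd")); // false
```

A lexical check like this should be paired with sandbox settings, since a symlink inside the workspace can still point outside it.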

Session IDs are validated before disk access so persisted transcripts cannot escape the configured session store via absolute paths, .., or null-byte input. For multi-tenant hosts, also isolate session.dir, CLAVUE_AGENT_JOBS_DIR, and runtimeNamespace per tenant.
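
Hosts that accept session IDs from users may want a similar check of their own before passing IDs to the SDK. The function below is an illustrative sketch, not the SDK's validator. It rejects the three cases named above (absolute paths, "..", and null bytes), and the character allow-list and 128-character cap are extra assumptions.

```typescript
// Hypothetical host-side validator for user-supplied session IDs.
function isSafeSessionId(id: string): boolean {
  if (id.length === 0 || id.length > 128) return false; // length cap is an assumption
  if (id.includes("\0")) return false; // null-byte input
  if (id.includes("/") || id.includes("\\")) return false; // no separators, so no absolute or nested paths
  if (id === "." || id === "..") return false; // directory references
  // Allow-list of characters (assumption): letters, digits, dot, underscore, hyphen.
  return /^[A-Za-z0-9._-]+$/.test(id);
}

console.log(isSafeSessionId("sess_2024-01")); // true
console.log(isSafeSessionId("../other-tenant")); // false
```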

| Tool | Description |
| --- | --- |
| Bash | Execute shell commands |
| Read | Read files with line numbers |
| Write | Create / overwrite files |
| Edit | Precise string replacement in files |
| Glob | Find files by pattern |
| Grep | Search file contents with regex |
| WebFetch | Fetch and parse web content |
| WebSearch | Search the web |
| NotebookEdit | Edit Jupyter notebook cells |
| Agent | Spawn subagents for parallel work |
| AgentJobList/Get/Stop | Inspect and cancel durable background AgentJobs |
| Skill | Invoke registered skills |
| TaskCreate/List/Update/Get/Stop/Output | Task management system |
| TeamCreate/Delete | Multi-agent team coordination |
| SendMessage | Inter-agent messaging |
| EnterWorktree/ExitWorktree | Git worktree isolation |
| EnterPlanMode/ExitPlanMode | Structured planning workflow |
| AskUserQuestion | Ask the user for input |
| ToolSearch | Discover lazy-loaded tools |
| ListMcpResources/ReadMcpResource | MCP resource access |
| CronCreate/Delete/List | Scheduled task management |
| RemoteTrigger | Remote agent triggers |
| LSP | Language Server Protocol (code intelligence) |
| Config | Dynamic configuration |
| TodoWrite | Session todo list |

Bundled skills

| Skill | Description |
| --- | --- |
| simplify | Review changed code for reuse, quality, and efficiency |
| commit | Create a git commit with a well-crafted message |
| review | Review code changes for correctness, security, and performance |
| debug | Systematic debugging using structured investigation |
| test | Run tests and analyze failures |
| define | Define goals, constraints, assumptions, and acceptance criteria |
| plan | Produce an ordered implementation plan and verification strategy |
| build | Implement scoped changes while preserving local patterns |
| verify | Run targeted checks and report evidence |
| workflow-review | Review lifecycle work for defects, risks, and missing evidence |
| ship | Prepare a handoff or release summary with verification status |
| repair | Diagnose and fix failed workflow outcomes with recovery evidence |

Architecture

┌──────────────────────────────────────────────────────┐
│                   Your Application                    │
│                                                       │
│   import { createAgent } from 'clavue-agent-sdk' │
└────────────────────────┬─────────────────────────────┘
                         │
              ┌──────────▼──────────┐
              │       Agent         │  Session state, tool pool,
              │  query() / prompt() │  MCP connections, hooks
              └──────────┬──────────┘
                         │
              ┌──────────▼──────────┐
              │    QueryEngine      │  Agentic loop:
              │   submitMessage()   │  API call → tools → repeat
              └──────────┬──────────┘
                         │
         ┌───────────────┼───────────────┐
         │               │               │
   ┌─────▼─────┐  ┌─────▼─────┐  ┌─────▼─────┐
   │  Provider  │  │  35 Tools │  │    MCP     │
   │ Anthropic  │  │ Bash,Read │  │  Servers   │
   │  OpenAI    │  │ Edit,...  │  │ stdio/SSE/ │
   │ DeepSeek   │  │ + Skills  │  │ HTTP/SDK   │
   └───────────┘  └───────────┘  └───────────┘

Key internals:

| Component | Description | | --------------------- | ------------------------------------------------------------------ | | Provider layer | Abstracts Anthropic / OpenAI API differences | | QueryEngine | Core agentic loop with auto-compact, retry, safe tool orchestration | | Skill system | Reusable executable workflows with bundled coding, review, test, and lifecycle skills | | Hook system | 20 lifecycle events integrated into the engine | | Auto-compact | Summarizes conversation when context window fills up | | Micro-compact | Truncates oversized tool results | | Retry | Exponential backoff for rate limits, transi