@tinydarkforge/arbiter
v0.1.1
Deterministic guardrails for TypeScript AI agent loops. Tool-call validation, dollar-cost circuit breakers, loop detection. No LLM dependency.
╔═╦═╦═╦═╗ █████ █████ █ █ █ █ █████ █ █
║▣║▣║▣║▣║ █ █ █ █ █ ██ █ █ █ █
╠═╩═╩═╩═╣ █ ██ █████ █ █ █ █ █ █ █
║ ░░⊗░░ ║ █ █ █ █ █ █ █ ██ █ █ █
║ ▒▒▒▒▒ ║ █████ █ █ █████ █ █ █ █ █
╠═══════╣
║▓▓▓▓▓▓▓║ ━━━━━━━━━━ AGENT GAUNTLET ━━━━━━━━━━
╔═══╩═══════╩═══╗ Limits · Schema · Tools · Cost · Loops
║▓▓░░░░░░░░░░░▓▓║ — one guard, one verdict, sub-5ms.
║▓▓▒▒▒▒▒▒▒▒▒▒▒▓▓║ No LLM. MIT · No account · No tel.
╚═══════════════╝

Arbiter is a deterministic safety gauntlet for TypeScript AI agent loops. Every step runs through one synchronous `check()` — token & dollar budgets, tool-call rules, schema, repetition and cycle detection — and returns `ok`/`retry`/`abort` in under 5 ms. No LLM in the path. No embeddings. No surprises.
Status: v0.1.1 — live on npm: `@tinydarkforge/arbiter`. Public API frozen for v0.1; reason codes stable.
░▒▓█ TL;DR
```sh
npm install @tinydarkforge/arbiter
```

```ts
import { createGuard } from "@tinydarkforge/arbiter";

const guard = createGuard("./arbiter.config.json");
const verdict = guard.check({ task_id, output, tool_calls, model, tokens_in, tokens_out, attempt });

if (verdict.status === "abort") throw new Error(verdict.reasons[0].code);
if (verdict.status === "retry") continue;
```

arbiter does not generate. it judges.
░▒▓█ What it does today
Most AI guardrail tools protect a single LLM call. Arbiter protects the whole agent loop:
- Budget enforcement — cumulative token + dollar caps across N steps, per-step caps, per-task caps. Stops $50 runaway agents.
- Tool-call discipline — allowlist, per-tool arg schemas, mutex groups, blast-radius caps, sequence rules (`deploy` requires prior `run_tests`).
- Output validation — JSON schema, length bounds, forbidden-pattern regex. Deterministic, no model in the path.
- Loop detection — n-gram repetition on output, identical-tool-call detection, state-cycle detection. No embeddings.
- Retry budget — knows when to give up. `retry_exhausted` is an abort, not a hang.

One sync call. One verdict. Three statuses: `ok`, `retry`, `abort`.
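The dollar side of budget enforcement is plain arithmetic over the price table. A minimal sketch of that accounting, assuming per-1M-token pricing as in the `cost.prices` config (helper names are hypothetical, not Arbiter's internals):

```typescript
type Price = { input_per_1m: number; output_per_1m: number };

// Dollars for one step: input and output tokens priced separately, per 1M tokens.
function stepDollars(price: Price, tokensIn: number, tokensOut: number): number {
  return (tokensIn / 1e6) * price.input_per_1m + (tokensOut / 1e6) * price.output_per_1m;
}

// Cumulative cap check across a task, mirroring max_dollars_per_task.
function exceedsCap(
  steps: { tokensIn: number; tokensOut: number }[],
  price: Price,
  cap: number,
): boolean {
  const total = steps.reduce((sum, s) => sum + stepDollars(price, s.tokensIn, s.tokensOut), 0);
  return total > cap;
}
```

At gpt-4o rates ($2.50/$10.00 per 1M), a hundred 4k-in/2k-out steps is $3.00, so a $0.50 cap trips long before the loop finishes.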
░▒▓█ Guards
| Guard | Class | What it catches |
|----------------|----------------------|--------------------------------------------------------------------------|
| limits | budget | max_steps, per-step + cumulative tokens, output length out of range |
| forbidden | output | case-insensitive regex over output ("as an ai language model", etc.) |
| schema | output | Ajv on output JSON, on tool args |
| tool-calls | sequencing / safety | allowlist, mutex, blast radius, requires_prev sequencing |
| cost | budget | model + token → dollars via price table; max_dollars_per_task cap |
| loop | runaway | n-gram repetition, identical tool repeat, state cycles |
Every guard is deterministic, reproducible, and runs in under 5ms p95 against a 2 KB output with three tool calls. No external calls. No model dependency.
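The loop guard's repetition check can be pictured as counting word n-grams and flagging any that recur more than `max_repeats` times. A rough sketch of the idea, not Arbiter's actual tokenizer or windowing:

```typescript
// True if any n-gram of `size` words appears more than `maxRepeats` times.
function repeatsNgram(text: string, size: number, maxRepeats: number): boolean {
  const words = text.toLowerCase().split(/\s+/).filter(Boolean);
  const counts = new Map<string, number>();
  for (let i = 0; i + size <= words.length; i++) {
    const gram = words.slice(i, i + size).join(" ");
    const n = (counts.get(gram) ?? 0) + 1;
    if (n > maxRepeats) return true; // threshold crossed: deterministic flag
    counts.set(gram, n);
  }
  return false;
}
```

Because the check is pure string counting, the same output history always yields the same verdict, which is what makes the loop guard reproducible.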
░▒▓█ Positioning
Arbiter is not an output parser, a content moderation classifier, or a dialog flow engine. It is a loop-level safety gate that sits between your agent and its next step.
| Alternative | When to pick it instead of Arbiter |
|----------------------|--------------------------------------------------------------------------|
| Guardrails AI | Single-call validation in Python, with managed validators. |
| NeMo Guardrails | Conversational dialog flow with Colang. Different problem. |
| instructor | Structured-output extraction in Python. Schema only, no loop guards. |
| LangChain parsers | You already live in LangChain and only need output shape validation. |
| Llama Guard / Lakera | Semantic safety classification (toxicity, jailbreaks). Use as classifier behind Arbiter (v0.3 hook). |
Arbiter's niche: TypeScript-native, sub-5ms, deterministic, multi-step, no LLM dependency. If you need a hosted classifier or dialog rails, use those tools — and put Arbiter in front to enforce the budget.
░▒▓█ Three ways teams use it
- Cost circuit breaker. Set `max_dollars_per_task: 0.50`. Arbiter tracks model + tokens per step against a price table and aborts the loop the moment cumulative cost crosses the line.
- Tool sandbox. Restrict which tools the agent can call, validate args per tool, enforce sequence rules (`deploy` requires prior `run_tests`), cap blast radius (`write_file` max 5 per task).
- Loop killer. When an agent spirals (same output repeated, same tool call retried, state revisited), Arbiter detects the pattern deterministically — no embeddings — and aborts before your bill or your database gets shredded.
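Sequence rules and blast-radius caps reduce to simple bookkeeping over the task's tool-call history. A hedged sketch of that bookkeeping (illustrative only; return values mimic Arbiter's reason codes, but this is not its internal code):

```typescript
type SeqRule = { tool: string; requires_prev: string };

// Returns a reason-code-style string for the first violation, or null if clean.
function checkToolCall(
  name: string,
  history: string[], // tools already called in this task
  rules: SeqRule[],
  blastRadius: Record<string, number>,
): string | null {
  // Sequence: the required predecessor must appear somewhere in the history.
  const rule = rules.find((r) => r.tool === name);
  if (rule && !history.includes(rule.requires_prev)) return "tool_sequence";

  // Blast radius: this call must not push the per-tool count past its cap.
  const used = history.filter((t) => t === name).length;
  const cap = blastRadius[name];
  if (cap !== undefined && used + 1 > cap) return "tool_blast_radius";

  return null;
}
```

The same pair of checks runs on every step, so a `deploy` with no prior `run_tests` is rejected before it executes, not after.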
░▒▓█ Prerequisites
Node.js >= 20. Single runtime dep: `ajv`. No other native or network dependencies.
░▒▓█ Install
From npm

```sh
npm install @tinydarkforge/arbiter
```

One-shot CLI via npx

```sh
echo '{...}' | npx @tinydarkforge/arbiter check --config ./arbiter.config.json
```

From source

```sh
git clone https://github.com/tinydarkforge/arbiter.git
cd arbiter
npm install
npm run build
```

░▒▓█ 30-second quickstart
```ts
import { createGuard } from "@tinydarkforge/arbiter";

const guard = createGuard("./arbiter.config.json");

let attempt = 0;
while (!done) {
  const { output, tool_calls, tokens_in, tokens_out } = await callModel(...);

  const verdict = guard.check({
    task_id: "task-1",
    state: "execute",
    output,
    tool_calls,
    model: "gpt-4o",
    tokens_in,
    tokens_out,
    attempt,
  });

  if (verdict.status === "abort") throw new Error(`arbiter aborted: ${verdict.reasons[0].code}`);
  if (verdict.status === "retry") { attempt++; continue; }

  attempt = 0;
  await executeToolCalls(tool_calls);
}
```

A runnable demo loop hitting `cost_cap`, `loop_repeat_tool`, `tool_blast_radius`, and `tool_sequence` lives at `examples/agent.js`.
░▒▓█ Configuration
`arbiter.config.json`:

```json
{
  "limits": {
    "max_steps": 20,
    "max_tokens_per_step": 4000,
    "max_total_tokens": 50000,
    "output_min": 1,
    "output_max": 10000
  },
  "forbidden_patterns": ["as an ai language model", "lorem ipsum"],
  "tool_calls": {
    "allowed": ["search", "read_file", "write_file", "run_tests", "deploy"],
    "blast_radius": { "write_file": 5, "deploy": 1 },
    "mutex": [["deploy", "run_tests"]],
    "sequence": [{ "tool": "deploy", "requires_prev": "run_tests" }]
  },
  "cost": {
    "prices": {
      "gpt-4o": { "input_per_1m": 2.5, "output_per_1m": 10.0 },
      "gpt-4o-mini": { "input_per_1m": 0.15, "output_per_1m": 0.6 },
      "claude-sonnet-4-6": { "input_per_1m": 3.0, "output_per_1m": 15.0 },
      "claude-haiku-4-5": { "input_per_1m": 1.0, "output_per_1m": 5.0 }
    },
    "max_dollars_per_task": 0.5
  },
  "loop_detection": {
    "ngram_size": 5,
    "max_repeats": 2,
    "detect_identical_tool_calls": true,
    "max_state_visits": 3
  },
  "retry": { "max_attempts": 2 },
  "store": { "ttl_ms": 600000, "history_limit": 50 }
}
```

Schema: `schema/config.schema.json`.
Field reference
| Field | Type | Default | Description |
|----------------------------------|-----------|-----------------|--------------------------------------------------------------------------|
| limits.max_steps | number | — | Hard cap on steps per task_id before max_steps abort |
| limits.max_tokens_per_step | number | — | Per-step token cap (sum of tokens_in + tokens_out) |
| limits.max_total_tokens | number | — | Cumulative token cap across the task |
| limits.output_min / output_max | number | — | Output length bounds — out of range = length_min / length_max retry |
| forbidden_patterns | string[] | [] | Regex sources, applied case-insensitive over output |
| tool_calls.allowed | string[] | — | Tool allowlist; anything else → tool_not_allowed abort |
| tool_calls.arg_schemas | object | {} | Per-tool JSON schema for args (Ajv compiled, cached) |
| tool_calls.mutex | string[][] | [] | Groups; calling two from the same group in one task → tool_mutex abort |
| tool_calls.blast_radius | object | {} | Per-tool max call count across the task |
| tool_calls.sequence | array | [] | { tool, requires_prev } rules; missing predecessor = tool_sequence |
| cost.prices | object | {} | Per-model input_per_1m / output_per_1m USD |
| cost.max_dollars_per_task | number | — | Cumulative cap; overflow → cost_cap abort |
| loop_detection.ngram_size | number | 5 | Token n-gram window over output history |
| loop_detection.max_repeats | number | 2 | Repeats before loop_repeat_output fires |
| loop_detection.detect_identical_tool_calls | boolean | true | Stable JSON of {name, args} vs immediate previous |
| loop_detection.max_state_visits | number | 3 | State-cycle threshold |
| retry.max_attempts | number | — | Retry budget; attempt > max_attempts → retry_exhausted abort |
| store.ttl_ms | number | 600000 | Per-task state TTL; reclaimed by gc() |
| store.history_limit | number | 50 | Ring buffer length for output / tool / state histories |
Unknown models in cost.prices are silently skipped (no abort, no cost accumulated). Missing config sections fall back to library defaults.
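The mutex field above is a set-membership check over the task's tool history. A toy sketch of the rule (the `rollback` tool and its group here are hypothetical, chosen for illustration; this is not Arbiter's internal code):

```typescript
// True if calling `name` now would mean two *different* tools from the same
// mutex group have been used within one task.
function violatesMutex(name: string, history: string[], mutex: string[][]): boolean {
  return mutex.some(
    (group) =>
      group.includes(name) &&
      history.some((t) => t !== name && group.includes(t)),
  );
}
```

Note that repeating the same tool never trips a mutex group; only mixing two distinct members of a group does.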
░▒▓█ API
`createGuard(configOrPath): Guard`

```ts
type Guard = {
  check(input: CheckInput): CheckResult;
  reset(task_id: string): void;
  gc(ttl_ms?: number): number; // returns count of evicted task states
};
```

CheckInput
```ts
type CheckInput = {
  task_id: string;
  step?: number;       // optional; auto-incremented
  state?: string;      // for state-cycle detection
  output?: string;     // model text/JSON output
  tool_calls?: { name: string; args: unknown; id?: string }[];
  model?: string;      // pricing key
  tokens_in?: number;
  tokens_out?: number;
  attempt?: number;    // for retry budget
};
```

CheckResult
```ts
type CheckResult = {
  status: "ok" | "retry" | "abort";
  reasons: { code: ReasonCode; message: string; meta?: unknown }[];
  metrics: {
    steps: number;
    total_tokens_in: number;
    total_tokens_out: number;
    total_dollars: number;
    tool_counts: Record<string, number>;
    elapsed_ms: number;
  };
};
```

Status semantics
| Status | Meaning |
|----------|-----------------------------------------------------------------------------|
| ok | Step accepted; state committed; continue the loop |
| retry | Step rejected, recoverable; state not committed; bump attempt + retry |
| abort | Step rejected, terminal; state not committed; tear the task down |
Aborts and retries never mutate task state. The store only commits on ok.
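The commit-on-ok contract suggests a small dispatcher on the caller's side. A sketch of one way to write it, using locally declared types (the `Action` union and `nextAction` helper are illustrative, not part of the API):

```typescript
type Status = "ok" | "retry" | "abort";
type Verdict = { status: Status; reasons: { code: string; message: string }[] };
type Action =
  | { kind: "continue" }
  | { kind: "retry"; attempt: number }
  | { kind: "teardown"; code: string };

// Map a verdict to the caller's next move; attempt only advances on retry,
// since ok resets it and abort ends the task entirely.
function nextAction(verdict: Verdict, attempt: number): Action {
  switch (verdict.status) {
    case "ok":
      return { kind: "continue" };
    case "retry":
      return { kind: "retry", attempt: attempt + 1 };
    case "abort":
      return { kind: "teardown", code: verdict.reasons[0]?.code ?? "unknown" };
  }
}
```

Keeping this mapping in one place makes it hard to accidentally treat a retry as an ok and commit state the guard refused.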
Reason codes
| Code | Class | Meaning |
|------|-------|---------|
| schema_invalid | retry | output failed JSON schema |
| forbidden_pattern | retry | regex hit in output |
| length_min / length_max | retry | output length out of range |
| tool_args_invalid | retry | tool args failed schema |
| max_steps | abort | step counter exceeded |
| max_tokens_step | abort | per-step token cap exceeded |
| max_tokens_total | abort | cumulative token cap exceeded |
| tool_not_allowed | abort | tool call outside allowlist |
| tool_sequence | abort | sequence rule violated |
| tool_mutex | abort | mutually exclusive tools called |
| tool_blast_radius | abort | tool called too many times |
| cost_cap | abort | dollar cap exceeded |
| loop_repeat_output | abort | n-gram repetition detected |
| loop_repeat_tool | abort | identical tool call repeated |
| loop_state_cycle | abort | state visited too often |
| retry_exhausted | abort | retry budget hit |
Reasons accumulate in evaluation order; the most severe class present determines the status (abort > retry > ok).
░▒▓█ CLI
```sh
arbiter check --config ./arbiter.config.json < input.json
# stdin:  a single CheckInput object as JSON
# stdout: a CheckResult object as JSON
# exit 0 = ok, 1 = retry, 2 = abort, 3 = CLI / config error

arbiter --version
arbiter --help
```

The CLI is intentionally thin — one in, one out, exit code carries the verdict. Wire it into any orchestrator that can pipe JSON.
░▒▓█ Performance
Measured on a 2 KB output with three tool calls, against the v0.1 reference config, on a single thread:
| Percentile | check() latency |
|-----------:|-------------------|
| p50 | ~0.003 ms |
| p95 | ~0.007 ms |
| p99 | ~0.015 ms |
CI gate enforces p95 < 5ms per release. The metrics.elapsed_ms field on every CheckResult lets callers track this in production.
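Callers tracking `metrics.elapsed_ms` in production need a percentile over their samples. A minimal nearest-rank sketch (one of several percentile conventions; not how the CI gate computes it):

```typescript
// Nearest-rank percentile: p in [0, 100] over a sample of elapsed_ms values.
function percentile(samples: number[], p: number): number {
  const sorted = [...samples].sort((a, b) => a - b); // copy, don't mutate caller's data
  const rank = Math.ceil((p / 100) * sorted.length); // 1-based rank
  return sorted[Math.max(0, rank - 1)];
}
```

Feed it a rolling window of `elapsed_ms` values and alert if `percentile(window, 95)` drifts toward the 5 ms budget.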
░▒▓█ Limits / non-goals
- No semantic drift via embeddings. Intentional — keeps zero AI deps. Use the v0.3 classifier hook with Llama Guard / Lakera if you need it.
- No async `check()` in v0.1. Sync only. v0.3 adds `checkAsync` for HTTP / in-process classifiers.
- No PII / prompt-injection detection in v0.1. Bundled deterministic packs ship in v0.2.
- No dialog flow / conversational rails. Different problem — use NeMo Colang.
- No LLM-as-judge evals. Use Promptfoo, Braintrust.
- Arbiter never generates or rewrites output. It judges. Forever.
░▒▓█ Roadmap
- v0.1 — multi-step loop guards, tool-call validation, cost cap, loop detection (current)
- v0.2 — bundled deterministic safety packs (PII, prompt-injection, secrets) — regex/heuristic, no AI dep
- v0.3 — classifier plug-in hook (HTTP + in-process) — bring your own Llama Guard / Lakera / OpenAI moderation / Presidio
- v0.4+ — `arbiter replay <log.jsonl> --config new.json`, per-tenant quotas, OTel emission
Full plan: docs/tasks.md.
░▒▓█ Documentation
| Doc | What's in it |
|----------------------------------------------------|---------------------------------------------------------------|
| docs/description.md | Positioning, voice, target users, elevator pitch |
| docs/tasks.md | Build sequence, status, post-launch roadmap |
| schema/config.schema.json | Ajv-validated config schema (source of truth) |
| examples/agent.js | Runnable demo loop hitting four distinct abort codes |
| examples/config.json | Realistic reference config |
░▒▓█ License
MIT — © TinyDarkForge
╔═══╗
║ ⊗ ║ "JUDGE. BUDGET. HALT."
╚═══╝