reforge-ai v0.3.1
Agentic orchestration and semantic enforcement for structured LLM output, with native JSON repair and deterministic retries.
Reforge
Agentic orchestration and semantic enforcement for structured LLM output.
The Problem
LLMs are probabilistic and frequently output malformed JSON:
- Markdown wrappers — ```json ... ``` blocks around the data
- Trailing commas — {"name": "Alice",}
- Unquoted keys — {name: "Alice"}
- Single-quoted strings — {'name': 'Alice'}
- Truncated outputs — {"items": [1, 2, 3 (hit max_tokens)
- Escaped-quote anomalies — {\"key\": \"value\"}
A network retry against a provider (OpenAI, Anthropic, etc.) costs 5000ms+ of latency and real money. Most of these failures are trivially fixable locally.
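None of these are exotic failures: plain JSON.parse rejects every one of them outright, as a quick check shows:

```typescript
// JSON.parse enforces the strict JSON grammar, so typical LLM output fails hard.
function tryParse(raw: string): unknown {
  try {
    return JSON.parse(raw);
  } catch {
    return undefined; // SyntaxError for trailing commas, unquoted keys, etc.
  }
}

tryParse('{"name": "Alice"}');  // → { name: "Alice" }
tryParse('{"name": "Alice",}'); // → undefined (trailing comma)
tryParse('{name: "Alice"}');    // → undefined (unquoted key)
```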
The Solution
reforge-ai is a zero-dependency TypeScript library that sits between the LLM output and your application:
- Natively repairs syntactic JSON errors locally in under 5ms
- Validates against your Zod schema with automatic type coercion
- Optionally clamps semantic violations locally (too_small, too_big, enum drift) before network retries
- Generates token-efficient retry prompts when repair isn't enough
- Orchestrates tools, failovers, and retries with deterministic guardrails
- Works everywhere: Node.js, Bun, Deno, Cloudflare Workers, Vercel Edge, Browsers
New Capability Highlights
- Line-aware retry prompts: only relevant error lines are included (including multi-line contexts), reducing noisy retries.
- Prompt customization: plug in your own retry prompt strategy with retryPromptStrategy.
- Profiles + toggles: choose safe, standard, or aggressive guard profiles and override individual heuristics.
- Built-in redaction: redact sensitive paths/patterns from retry contexts.
- Debug artifacts: inspect extracted/repaired JSON and applied repair passes when needed.
- Advanced forge orchestration: retry policies, deterministic tool loops, structured lifecycle events, and forgeWithFallback() provider failover.
- Universal message schema: multi-modal content blocks + normalized tool history across OpenAI-compatible, Anthropic, and Gemini adapters.
- Stream-safe output hooks: use onChunk for UI output while suppressing internal tool JSON chatter.
Timing Snapshot (Measured)
From timing-summary-2026-03-13.md:
- guard() samples: 11
- Min: 0.1442ms
- Max: 2.5365ms
- Average: 0.5536ms
- Under 5ms: 11/11
Installation
npm install reforge-ai zod
To use provider adapters with forge(), also install the provider SDK:
# OpenAI / OpenRouter / Groq / Together / Ollama / etc.
npm install reforge-ai zod openai
# Anthropic
npm install reforge-ai zod @anthropic-ai/sdk
# Google Gemini
npm install reforge-ai zod @google/generative-ai
zod is a required peer dependency. Provider SDKs are optional peer dependencies — only install what you use.
Quick Start
import { z } from "zod";
import { guard } from "reforge-ai";
const UserSchema = z.object({
name: z.string(),
age: z.number(),
});
// Raw LLM output — markdown-wrapped with a trailing comma:
const raw = '```json\n{"name": "Alice", "age": 30,}\n```';
const result = guard(raw, UserSchema);
if (result.success) {
console.log(result.data); // { name: "Alice", age: 30 }
console.log(result.isRepaired); // true
console.log(result.telemetry); // { durationMs: 0.55, status: "repaired_natively" }
} else {
// Append result.retryPrompt to your LLM message array
console.log(result.retryPrompt);
console.log(result.errors); // ZodIssue[]
}

guard() with Line-Aware Retry Context
const result = guard(raw, UserSchema, {
profile: "standard",
retryPrompt: {
mode: "line-aware",
contextRadius: 1,
maxContextChars: 700,
redactPaths: ["/user/ssn"],
},
debug: true,
});
if (!result.success) {
console.log(result.retryPrompt); // includes only relevant lines in line-aware mode
console.log(result.debug?.retryContextBlocks);
}

End-to-End with forge()
forge() wraps the entire flow: call your LLM → repair → validate → auto-retry.
import { z } from "zod";
import { forge } from "reforge-ai";
import { openaiCompatible } from "reforge-ai/openai-compatible";
import OpenAI from "openai";
const provider = openaiCompatible(new OpenAI(), "gpt-4o");
const Colors = z.array(
z.object({
name: z.string(),
hex: z.string(),
})
);
const result = await forge(
provider,
[{ role: "user", content: "List 3 colors with hex codes." }],
Colors
);
if (result.success) {
console.log(result.data);
// → [{ name: "Red", hex: "#FF0000" }, ...]
console.log(result.telemetry);
// → { durationMs: 0.55, status: "repaired_natively", attempts: 1, totalDurationMs: 6132 }
}

Semantic Clamp (Local, No Network Retry)
import { z } from "zod";
import { guard } from "reforge-ai";
const Schema = z.object({
age: z.number().min(0).max(100),
tier: z.enum(["free", "pro", "enterprise"]),
});
const raw = JSON.stringify({ age: 154, tier: "vip" });
const result = guard(raw, Schema, {
semanticResolution: { mode: "clamp" },
});
if (result.success) {
console.log(result.data); // { age: 100, tier: "free" }
console.log(result.telemetry); // status: "coerced_locally", coercedPaths: ["/age", "/tier"]
}

Tool Loops + Stream Output
import { z } from "zod";
import { forge } from "reforge-ai";
const result = await forge(provider, messages, schema, {
tools: {
lookupCustomer: {
description: "Lookup customer by id",
schema: z.object({ id: z.string() }),
execute: async ({ id }) => ({ id, plan: "enterprise" }),
},
},
toolTimeoutMs: 4000,
maxAgentIterations: 5,
onChunk: (text) => uiStream.append(text),
});

forge() Advanced Controls
const result = await forge(provider, messages, Colors, {
retryPolicy: {
maxRetries: 4,
shouldRetry: (failure, attempt) => attempt < 3 && failure.errors.length > 0,
mutateProviderOptions: (attempt, base) => ({
...base,
temperature: attempt === 1 ? 0.6 : 0.2,
}),
},
onEvent: (event) => {
if (event.kind === "retry_scheduled") {
console.log(`Retrying: ${event.attempt} -> ${event.nextAttempt}`);
}
},
guardOptions: {
retryPrompt: { mode: "line-aware" },
},
});

Provider Fallback Chain
import { forgeWithFallback } from "reforge-ai";
const result = await forgeWithFallback(
[
{ provider: openaiCompatible(openaiClient, "gpt-4o"), maxAttempts: 2 },
{ provider: anthropic(anthropicClient, "claude-sonnet-4-20250514"), maxAttempts: 1 },
],
messages,
schema,
{
onProviderFallback: (from, to) => {
console.log(`Fallback provider: ${from} -> ${to}`);
},
},
);

Provider Adapters
| Adapter | Import | Covers |
|---|---|---|
| openaiCompatible() | reforge-ai/openai-compatible | OpenAI, OpenRouter, Groq, Together, Fireworks, Ollama, LM Studio, vLLM |
| anthropic() | reforge-ai/anthropic | Anthropic Claude |
| google() | reforge-ai/google | Google Gemini, Vertex AI |
// OpenRouter — same adapter, different baseURL
import { openaiCompatible } from "reforge-ai/openai-compatible";
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "https://openrouter.ai/api/v1",
apiKey: process.env.OPENROUTER_API_KEY,
});
const provider = openaiCompatible(client, "anthropic/claude-sonnet-4-20250514");

// Anthropic
import { anthropic } from "reforge-ai/anthropic";
import Anthropic from "@anthropic-ai/sdk";
const provider = anthropic(new Anthropic(), "claude-sonnet-4-20250514");

// Google Gemini
import { google } from "reforge-ai/google";
import { GoogleGenerativeAI } from "@google/generative-ai";
const provider = google(
new GoogleGenerativeAI(process.env.GOOGLE_API_KEY!),
"gemini-2.0-flash"
);

// Custom provider — implement a single method
import { forge, type ReforgeProvider } from "reforge-ai";
const myProvider: ReforgeProvider = {
async call(messages, options) {
const res = await fetch("https://my-llm-api.com/chat", {
method: "POST",
body: JSON.stringify({ messages, ...options }),
});
const data = await res.json();
return data.text;
},
};

How It Works
guard() runs a deterministic three-stage pipeline:
1. Dirty Parse (Native Repair)
The parser runs a sequence of heuristic passes to fix common LLM output issues:
| Issue | Before | After |
|---|---|---|
| Markdown fences | ```json\n{"a":1}\n``` | {"a":1} |
| Conversational wrapping | Here's the data: {"a":1} | {"a":1} |
| Trailing commas | {"a": 1,} | {"a": 1} |
| Unquoted keys | {name: "Alice"} | {"name": "Alice"} |
| Single quotes | {'key': 'val'} | {"key": "val"} |
| Escaped quotes | {\"key\": \"val\"} | {"key": "val"} |
| Truncated output | {"items": [1, 2 | {"items": [1, 2]} |
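As a rough illustration of what such heuristic passes look like, here is a minimal sketch of two of them, stripping markdown fences and trailing commas. This is illustrative only, not Reforge's actual implementation; a real repair pass would need to be string-literal-aware so it never rewrites commas inside quoted values:

```typescript
// Illustrative repair passes — NOT Reforge's internals.
// Pass 1: unwrap a ```json ... ``` fence if present.
function stripFences(raw: string): string {
  const match = raw.match(/```(?:json)?\s*([\s\S]*?)\s*```/);
  return match ? match[1] : raw.trim();
}

// Pass 2: drop trailing commas before a closing } or ].
// (Naive: a tokenizer-based pass is needed to skip commas inside strings.)
function stripTrailingCommas(raw: string): string {
  return raw.replace(/,\s*([}\]])/g, "$1");
}

const raw = '```json\n{"name": "Alice", "age": 30,}\n```';
const repaired = stripTrailingCommas(stripFences(raw));
JSON.parse(repaired); // → { name: "Alice", age: 30 }
```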
2. Schema Validation (with Coercion)
After parsing, the data is validated against your Zod schema. Reforge also attempts automatic coercion for common LLM type mismatches:
| LLM Output | Schema Expects | Coerced To |
|---|---|---|
| "true" / "false" | boolean | true / false |
| "42", "3.14" | number | 42, 3.14 |
| "null" | nullable | null |
3. Retry Prompt Generation
If validation fails, Reforge generates a token-efficient prompt you can append to your LLM conversation:
Your previous response failed schema validation. Errors: [Path: /user/age, Expected: number, Received: string]. The schema is still in your context — return ONLY corrected valid JSON.
When the LLM returns something that can't be parsed as JSON at all, the raw text is echoed back if it's short enough to be the full picture (≤300 chars). For longer outputs, the snippet is omitted — a truncated fragment of the beginning shows nothing useful:
// Short output — full text echoed:
Your previous response could not be parsed as JSON. Got: `{name: Alice age: 30}`. The schema is still in your context — return ONLY valid JSON.
// Long output — snippet omitted:
Your previous response could not be parsed as JSON. The schema is still in your context — return ONLY valid JSON.
No network requests. No retries. Just a string you feed back.
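The length-gated fallback is straightforward to sketch; the 300-char threshold and wording below come from the description above, but this is an illustration rather than Reforge's exact prompt builder:

```typescript
// Sketch of the length-gated parse-failure prompt described above.
function parseFailurePrompt(raw: string, maxEcho = 300): string {
  const base = "Your previous response could not be parsed as JSON.";
  const tail = "The schema is still in your context — return ONLY valid JSON.";
  // Echo the raw text only when it is short enough to be the full picture.
  return raw.length <= maxEcho
    ? `${base} Got: \`${raw}\`. ${tail}`
    : `${base} ${tail}`;
}

parseFailurePrompt("{name: Alice age: 30}"); // short → raw text echoed back
parseFailurePrompt("x".repeat(500));         // long → snippet omitted
```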
API Reference
guard<T>(llmOutput: string, schema: T, options?: GuardOptions): GuardResult<z.infer<T>>
The main entry-point. Parses, repairs, validates, and returns a typed result.
Parameters:
| Name | Type | Description |
|---|---|---|
| llmOutput | string | The raw string produced by an LLM |
| schema | ZodTypeAny | The Zod schema the output must conform to |
| options | GuardOptions | Optional profile/toggle config, line-aware retry mode, redaction, custom prompt strategy, and debug artifacts |
Returns: GuardResult<T> — a discriminated union:
// Success
{
success: true;
data: T; // Validated & typed data
telemetry: TelemetryData; // { durationMs, status }
isRepaired: boolean; // true if the Dirty Parser fixed the input
}
// Failure
{
success: false;
retryPrompt: string; // Token-efficient correction prompt
errors: ZodIssue[]; // Zod validation issues
telemetry: TelemetryData; // { durationMs, status: "failed" }
}

Types
type TelemetryData = {
durationMs: number;
status: "clean" | "repaired_natively" | "coerced_locally" | "failed";
coercedPaths?: string[];
};

forge<T>(provider, messages, schema, options?): Promise<ForgeResult<z.infer<T>>>
End-to-end structured output: call LLM → guard() → auto-retry.
Parameters:
| Name | Type | Description |
|---|---|---|
| provider | ReforgeProvider | An adapter wrapping your LLM SDK |
| messages | Message[] | Conversation messages to send |
| schema | ZodTypeAny | The Zod schema the output must conform to |
| options | ForgeOptions | Optional: maxRetries, retryPolicy, providerOptions, guardOptions, onRetry, onEvent |
onRetry is called after each failed attempt that will be retried:
onRetry?: (attempt, failure) => {
// attempt is 1-based
// failure.errors are the zod issues from the failed guard() call
// failure.retryPrompt is the corrective prompt used for the next attempt
}

Returns: Promise<ForgeResult<T>>:
// Success
{
success: true;
data: T;
telemetry: ForgeTelemetry;
isRepaired: boolean;
}
// Failure
{
success: false;
errors: ZodIssue[];
retryPrompt: string;
telemetry: ForgeTelemetry;
}
// ForgeTelemetry extends TelemetryData
interface ForgeTelemetry extends TelemetryData {
attempts: number; // Total LLM calls made
totalDurationMs: number; // Wall-clock time for entire forge() call
networkDurationMs: number;
toolExecutionDurationMs: number;
providerHops: Array<{
providerId: string;
attempt: number;
succeeded: boolean;
durationMs: number;
}>;
attemptDetails: Array<{
attempt: number;
durationMs: number;
status: "clean" | "repaired_natively" | "coerced_locally" | "failed";
}>;
}

Message Model (v0.3+)
type Message = {
role: "system" | "user" | "assistant" | "tool";
content:
| string
| Array<
| { type: "text"; text: string }
| { type: "image_url"; image_url: { url: string; detail?: "auto" | "low" | "high" } }
>;
toolCalls?: Array<{ id: string; name: string; arguments: string }>;
toolResponse?: {
toolCallId: string;
name: string;
content: string | Array<{ type: "text"; text: string }>;
isError?: boolean;
};
};

Examples
OpenAI with forge()
import { z } from "zod";
import { forge } from "reforge-ai";
import { openaiCompatible } from "reforge-ai/openai-compatible";
import OpenAI from "openai";
const provider = openaiCompatible(new OpenAI(), "gpt-4o");
const RecipeSchema = z.object({
title: z.string(),
ingredients: z.array(z.string()),
steps: z.array(z.string()),
});
const result = await forge(
provider,
[
{ role: "system", content: "Return JSON only." },
{ role: "user", content: "Give me a recipe for chocolate cake." },
],
RecipeSchema,
{ maxRetries: 3, providerOptions: { temperature: 0.2 } }
);
if (result.success) {
console.log(`Resolved in ${result.telemetry.attempts} attempt(s)`);
console.log(result.data);
}

Anthropic with forge()
import { z } from "zod";
import { forge } from "reforge-ai";
import { anthropic } from "reforge-ai/anthropic";
import Anthropic from "@anthropic-ai/sdk";
const provider = anthropic(new Anthropic(), "claude-sonnet-4-20250514");
const SummarySchema = z.object({
title: z.string(),
summary: z.string(),
tags: z.array(z.string()),
});
const result = await forge(
provider,
[{ role: "user", content: "Summarize: TypeScript 5.7 adds ..." }],
SummarySchema
);

guard() Only (Manual Retry)
import OpenAI from "openai";
import { z } from "zod";
import { guard } from "reforge-ai";
const client = new OpenAI();
const ProductSchema = z.object({
name: z.string(),
price: z.number(),
tags: z.array(z.string()),
});
async function getProduct(prompt: string) {
const messages: OpenAI.ChatCompletionMessageParam[] = [
{ role: "user", content: prompt },
];
for (let attempt = 0; attempt < 3; attempt++) {
const response = await client.chat.completions.create({
model: "gpt-4o",
messages,
});
const raw = response.choices[0]?.message?.content ?? "";
const result = guard(raw, ProductSchema);
if (result.success) return result.data;
messages.push({ role: "assistant", content: raw });
messages.push({ role: "user", content: result.retryPrompt });
}
throw new Error("Failed after 3 attempts");
}

Edge Runtime (Next.js API Route)
// app/api/parse/route.ts
import { z } from "zod";
import { guard } from "reforge-ai";
export const runtime = "edge";
const PayloadSchema = z.object({
action: z.string(),
data: z.record(z.unknown()),
});
export async function POST(request: Request) {
const body = await request.text();
const result = guard(body, PayloadSchema);
if (result.success) {
return Response.json({ ok: true, data: result.data });
}
return Response.json(
{ ok: false, errors: result.errors },
{ status: 422 },
);
}

Performance
Reforge is designed for < 5ms end-to-end on a 2KB input. The entire pipeline is:
- Synchronous — no async, no network, no I/O
- Pure — no global state mutation
- O(n) — linear time relative to input length
- Never throws — all error paths return typed result objects
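To check the timing snapshot in your own environment, a generic harness like the one below works for any synchronous function, including guard(). The JSON.parse call at the bottom is only a stand-in target; point timeSync at your own guard() call:

```typescript
// Generic micro-timing harness for synchronous functions.
// Uses the global performance.now() (available in Node 16+, Bun, Deno, browsers).
function timeSync<T>(fn: () => T, runs = 11): { min: number; max: number; avg: number } {
  const samples: number[] = [];
  for (let i = 0; i < runs; i++) {
    const t0 = performance.now();
    fn();
    samples.push(performance.now() - t0);
  }
  return {
    min: Math.min(...samples),
    max: Math.max(...samples),
    avg: samples.reduce((a, b) => a + b, 0) / samples.length,
  };
}

// e.g. timeSync(() => guard(raw, Schema)) — 11 samples, like the snapshot above.
const stats = timeSync(() => JSON.parse('{"a": 1}'));
```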
Guarantees
- Zero dependencies in core — zod is a required peer dependency
- Environment agnostic — no Node-specific APIs (fs, path, Buffer)
- Tree-shakeable — ESM + CJS dual output via tsup
- Strict TypeScript — full type safety with discriminated union results
Environment Compatibility
| Runtime | Status | Notes |
|---|---|---|
| Node.js 16+ | ✅ Supported | CJS + ESM |
| Bun | ✅ Supported | Native ESM |
| Deno | ✅ Supported | Via npm: specifier |
| Cloudflare Workers | ✅ Supported | No Node APIs |
| Vercel Edge | ✅ Supported | Edge-compatible |
| Browser | ✅ Supported | ESM, tree-shakeable |
Documentation
Full documentation, interactive demo, and integration guides are available at reforge-ai-97558.web.app.
Contributing
We welcome contributions! Please see CONTRIBUTING.md for:
- Setup instructions
- Code standards
- PR workflow
- Test guidelines
Reporting Issues
- Bug reports: Open an issue with the raw input, schema, and unexpected result.
- Feature requests: Open an issue with the use case and proposed API.
Changelog
See CHANGELOG.md for a detailed history of changes.
License
GNU GPL v3
