reforge-ai v0.3.1
Agentic orchestration and semantic enforcement for structured LLM output, with native JSON repair and deterministic retries.
Reforge
Agentic orchestration and semantic enforcement for structured LLM output.
The Problem
LLMs are probabilistic and frequently output malformed JSON:
- Markdown wrappers — ```json ... ``` blocks around the data
- Trailing commas — {"name": "Alice",}
- Unquoted keys — {name: "Alice"}
- Single-quoted strings — {'name': 'Alice'}
- Truncated outputs — {"items": [1, 2, 3 (hit max_tokens)
- Escaped-quote anomalies — {\"key\": \"value\"}
A network retry against a provider (OpenAI, Anthropic, etc.) costs 5000ms+ of latency and real money. Most of these failures are trivially fixable locally.
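None of these are exotic failures: plain JSON.parse rejects every one of them outright, as a quick check shows:

```typescript
// JSON.parse enforces the strict JSON grammar, so typical LLM output fails hard.
function tryParse(raw: string): unknown {
  try {
    return JSON.parse(raw);
  } catch {
    return undefined; // SyntaxError for trailing commas, unquoted keys, etc.
  }
}

tryParse('{"name": "Alice"}');  // → { name: "Alice" }
tryParse('{"name": "Alice",}'); // → undefined (trailing comma)
tryParse('{name: "Alice"}');    // → undefined (unquoted key)
```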
The Solution
reforge-ai is a zero-dependency TypeScript library that sits between the LLM output and your application:
- Natively repairs syntactic JSON errors locally in under 5ms
- Validates against your Zod schema with automatic type coercion
- Optionally clamps semantic violations locally (too_small, too_big, enum drift) before network retries
- Generates token-efficient retry prompts when repair isn't enough
- Orchestrates tools, failovers, and retries with deterministic guardrails
- Works everywhere: Node.js, Bun, Deno, Cloudflare Workers, Vercel Edge, Browsers
New Capability Highlights
- Line-aware retry prompts: only relevant error lines are included (including multi-line contexts), reducing noisy retries.
- Prompt customization: plug in your own retry prompt strategy with retryPromptStrategy.
- Profiles + toggles: choose safe, standard, or aggressive guard profiles and override individual heuristics.
- Built-in redaction: redact sensitive paths/patterns from retry contexts.
- Debug artifacts: inspect extracted/repaired JSON and applied repair passes when needed.
- Advanced forge orchestration: retry policies, deterministic tool loops, structured lifecycle events, and forgeWithFallback() provider failover.
- Universal message schema: multi-modal content blocks + normalized tool history across OpenAI-compatible, Anthropic, and Gemini adapters.
- Stream-safe output hooks: use onChunk for UI output while suppressing internal tool JSON chatter.
Timing Snapshot (Measured)
From timing-summary-2026-03-13.md:
- guard() samples: 11
- Min: 0.1442ms
- Max: 2.5365ms
- Average: 0.5536ms
- Under 5ms: 11/11
Installation
npm install reforge-ai zod
To use provider adapters with forge(), also install the provider SDK:
# OpenAI / OpenRouter / Groq / Together / Ollama / etc.
npm install reforge-ai zod openai
# Anthropic
npm install reforge-ai zod @anthropic-ai/sdk
# Google Gemini
npm install reforge-ai zod @google/generative-ai
zod is a required peer dependency. Provider SDKs are optional peer dependencies — only install what you use.
Quick Start
import { z } from "zod";
import { guard } from "reforge-ai";
const UserSchema = z.object({
name: z.string(),
age: z.number(),
});
// Raw LLM output — markdown-wrapped with a trailing comma:
const raw = '```json\n{"name": "Alice", "age": 30,}\n```';
const result = guard(raw, UserSchema);
if (result.success) {
console.log(result.data); // { name: "Alice", age: 30 }
console.log(result.isRepaired); // true
console.log(result.telemetry); // { durationMs: 0.55, status: "repaired_natively" }
} else {
// Append result.retryPrompt to your LLM message array
console.log(result.retryPrompt);
console.log(result.errors); // ZodIssue[]
}

guard() with Line-Aware Retry Context
const result = guard(raw, UserSchema, {
profile: "standard",
retryPrompt: {
mode: "line-aware",
contextRadius: 1,
maxContextChars: 700,
redactPaths: ["/user/ssn"],
},
debug: true,
});
if (!result.success) {
console.log(result.retryPrompt); // includes only relevant lines in line-aware mode
console.log(result.debug?.retryContextBlocks);
}

End-to-End with forge()
forge() wraps the entire flow: call your LLM → repair → validate → auto-retry.
import { z } from "zod";
import { forge } from "reforge-ai";
import { openaiCompatible } from "reforge-ai/openai-compatible";
import OpenAI from "openai";
const provider = openaiCompatible(new OpenAI(), "gpt-4o");
const Colors = z.array(
z.object({
name: z.string(),
hex: z.string(),
})
);
const result = await forge(
provider,
[{ role: "user", content: "List 3 colors with hex codes." }],
Colors
);
if (result.success) {
console.log(result.data);
// → [{ name: "Red", hex: "#FF0000" }, ...]
console.log(result.telemetry);
// → { durationMs: 0.55, status: "repaired_natively", attempts: 1, totalDurationMs: 6132 }
}

Semantic Clamp (Local, No Network Retry)
import { z } from "zod";
import { guard } from "reforge-ai";
const Schema = z.object({
age: z.number().min(0).max(100),
tier: z.enum(["free", "pro", "enterprise"]),
});
const raw = JSON.stringify({ age: 154, tier: "vip" });
const result = guard(raw, Schema, {
semanticResolution: { mode: "clamp" },
});
if (result.success) {
console.log(result.data); // { age: 100, tier: "free" }
console.log(result.telemetry); // status: "coerced_locally", coercedPaths: ["/age", "/tier"]
}

Tool Loops + Stream Output
import { z } from "zod";
import { forge } from "reforge-ai";
const result = await forge(provider, messages, schema, {
tools: {
lookupCustomer: {
description: "Lookup customer by id",
schema: z.object({ id: z.string() }),
execute: async ({ id }) => ({ id, plan: "enterprise" }),
},
},
toolTimeoutMs: 4000,
maxAgentIterations: 5,
onChunk: (text) => uiStream.append(text),
});

forge() Advanced Controls
const result = await forge(provider, messages, Colors, {
retryPolicy: {
maxRetries: 4,
shouldRetry: (failure, attempt) => attempt < 3 && failure.errors.length > 0,
mutateProviderOptions: (attempt, base) => ({
...base,
temperature: attempt === 1 ? 0.6 : 0.2,
}),
},
onEvent: (event) => {
if (event.kind === "retry_scheduled") {
console.log(`Retrying: ${event.attempt} -> ${event.nextAttempt}`);
}
},
guardOptions: {
retryPrompt: { mode: "line-aware" },
},
});

Provider Fallback Chain
import { forgeWithFallback } from "reforge-ai";
const result = await forgeWithFallback(
[
{ provider: openaiCompatible(openaiClient, "gpt-4o"), maxAttempts: 2 },
{ provider: anthropic(anthropicClient, "claude-sonnet-4-20250514"), maxAttempts: 1 },
],
messages,
schema,
{
onProviderFallback: (from, to) => {
console.log(`Fallback provider: ${from} -> ${to}`);
},
},
);

Provider Adapters
| Adapter | Import | Covers |
|---|---|---|
| openaiCompatible() | reforge-ai/openai-compatible | OpenAI, OpenRouter, Groq, Together, Fireworks, Ollama, LM Studio, vLLM |
| anthropic() | reforge-ai/anthropic | Anthropic Claude |
| google() | reforge-ai/google | Google Gemini, Vertex AI |
// OpenRouter — same adapter, different baseURL
import { openaiCompatible } from "reforge-ai/openai-compatible";
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "https://openrouter.ai/api/v1",
apiKey: process.env.OPENROUTER_API_KEY,
});
const provider = openaiCompatible(client, "anthropic/claude-sonnet-4-20250514");

// Anthropic
import { anthropic } from "reforge-ai/anthropic";
import Anthropic from "@anthropic-ai/sdk";
const provider = anthropic(new Anthropic(), "claude-sonnet-4-20250514");

// Google Gemini
import { google } from "reforge-ai/google";
import { GoogleGenerativeAI } from "@google/generative-ai";
const provider = google(
new GoogleGenerativeAI(process.env.GOOGLE_API_KEY!),
"gemini-2.0-flash"
);

// Custom provider — implement a single method
import { forge, type ReforgeProvider } from "reforge-ai";
const myProvider: ReforgeProvider = {
async call(messages, options) {
const res = await fetch("https://my-llm-api.com/chat", {
method: "POST",
body: JSON.stringify({ messages, ...options }),
});
const data = await res.json();
return data.text;
},
};

How It Works
guard() runs a deterministic three-stage pipeline:
1. Dirty Parse (Native Repair)
The parser runs a sequence of heuristic passes to fix common LLM output issues:
| Issue | Before | After |
|---|---|---|
| Markdown fences | ```json\n{"a":1}\n``` | {"a":1} |
| Conversational wrapping | Here's the data: {"a":1} | {"a":1} |
| Trailing commas | {"a": 1,} | {"a": 1} |
| Unquoted keys | {name: "Alice"} | {"name": "Alice"} |
| Single quotes | {'key': 'val'} | {"key": "val"} |
| Escaped quotes | {\"key\": \"val\"} | {"key": "val"} |
| Truncated output | {"items": [1, 2 | {"items": [1, 2]} |
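As a rough illustration of what such heuristic passes look like, here is a minimal sketch of two of them, stripping markdown fences and trailing commas. This is illustrative only, not Reforge's actual implementation; a real repair pass would need to be string-literal-aware so it never rewrites commas inside quoted values:

```typescript
// Illustrative repair passes — NOT Reforge's internals.
// Pass 1: unwrap a ```json ... ``` fence if present.
function stripFences(raw: string): string {
  const match = raw.match(/```(?:json)?\s*([\s\S]*?)\s*```/);
  return match ? match[1] : raw.trim();
}

// Pass 2: drop trailing commas before a closing } or ].
// (Naive: a tokenizer-based pass is needed to skip commas inside strings.)
function stripTrailingCommas(raw: string): string {
  return raw.replace(/,\s*([}\]])/g, "$1");
}

const raw = '```json\n{"name": "Alice", "age": 30,}\n```';
const repaired = stripTrailingCommas(stripFences(raw));
JSON.parse(repaired); // → { name: "Alice", age: 30 }
```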
2. Schema Validation (with Coercion)
After parsing, the data is validated against your Zod schema. Reforge also attempts automatic coercion for common LLM type mismatches:
| LLM Output | Schema Expects | Coerced To |
|---|---|---|
| "true" / "false" | boolean | true / false |
| "42", "3.14" | number | 42, 3.14 |
| "null" | nullable | null |
3. Retry Prompt Generation
If validation fails, Reforge generates a token-efficient prompt you can append to your LLM conversation:
Your previous response failed schema validation. Errors: [Path: /user/age, Expected: number, Received: string]. The schema is still in your context — return ONLY corrected valid JSON.
When the LLM returns something that can't be parsed as JSON at all, the raw text is echoed back if it's short enough to be the full picture (≤300 chars). For longer outputs, the snippet is omitted — a truncated fragment of the beginning shows nothing useful:
// Short output — full text echoed:
Your previous response could not be parsed as JSON. Got: `{name: Alice age: 30}`. The schema is still in your context — return ONLY valid JSON.
// Long output — snippet omitted:
Your previous response could not be parsed as JSON. The schema is still in your context — return ONLY valid JSON.
No network requests. No retries. Just a string you feed back.
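The length-gated fallback is straightforward to sketch; the 300-char threshold and wording below come from the description above, but this is an illustration rather than Reforge's exact prompt builder:

```typescript
// Sketch of the length-gated parse-failure prompt described above.
function parseFailurePrompt(raw: string, maxEcho = 300): string {
  const base = "Your previous response could not be parsed as JSON.";
  const tail = "The schema is still in your context — return ONLY valid JSON.";
  // Echo the raw text only when it is short enough to be the full picture.
  return raw.length <= maxEcho
    ? `${base} Got: \`${raw}\`. ${tail}`
    : `${base} ${tail}`;
}

parseFailurePrompt("{name: Alice age: 30}"); // short → raw text echoed back
parseFailurePrompt("x".repeat(500));         // long → snippet omitted
```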
API Reference
guard<T>(llmOutput: string, schema: T, options?: GuardOptions): GuardResult<z.infer<T>>
The main entry-point. Parses, repairs, validates, and returns a typed result.
Parameters:
| Name | Type | Description |
|---|---|---|
| llmOutput | string | The raw string produced by an LLM |
| schema | ZodTypeAny | The Zod schema the output must conform to |
| options | GuardOptions | Optional profile/toggle config, line-aware retry mode, redaction, custom prompt strategy, and debug artifacts |
Returns: GuardResult<T> — a discriminated union:
// Success
{
success: true;
data: T; // Validated & typed data
telemetry: TelemetryData; // { durationMs, status }
isRepaired: boolean; // true if the Dirty Parser fixed the input
}
// Failure
{
success: false;
retryPrompt: string; // Token-efficient correction prompt
errors: ZodIssue[]; // Zod validation issues
telemetry: TelemetryData; // { durationMs, status: "failed" }
}

Types
type TelemetryData = {
durationMs: number;
status: "clean" | "repaired_natively" | "coerced_locally" | "failed";
coercedPaths?: string[];
};

forge<T>(provider, messages, schema, options?): Promise<ForgeResult<z.infer<T>>>
End-to-end structured output: call LLM → guard() → auto-retry.
Parameters:
| Name | Type | Description |
|---|---|---|
| provider | ReforgeProvider | An adapter wrapping your LLM SDK |
| messages | Message[] | Conversation messages to send |
| schema | ZodTypeAny | The Zod schema the output must conform to |
| options | ForgeOptions | Optional: maxRetries, retryPolicy, providerOptions, guardOptions, onRetry, onEvent |
onRetry is called after each failed attempt that will be retried:
onRetry?: (attempt, failure) => {
// attempt is 1-based
// failure.errors are the zod issues from the failed guard() call
// failure.retryPrompt is the corrective prompt used for the next attempt
}

Returns: Promise<ForgeResult<T>>:
// Success
{
success: true;
data: T;
telemetry: ForgeTelemetry;
isRepaired: boolean;
}
// Failure
{
success: false;
errors: ZodIssue[];
retryPrompt: string;
telemetry: ForgeTelemetry;
}
// ForgeTelemetry extends TelemetryData
interface ForgeTelemetry extends TelemetryData {
attempts: number; // Total LLM calls made
totalDurationMs: number; // Wall-clock time for entire forge() call
networkDurationMs: number;
toolExecutionDurationMs: number;
providerHops: Array<{
providerId: string;
attempt: number;
succeeded: boolean;
durationMs: number;
}>;
attemptDetails: Array<{
attempt: number;
durationMs: number;
status: "clean" | "repaired_natively" | "coerced_locally" | "failed";
}>;
}

Message Model (v0.3+)
type Message = {
role: "system" | "user" | "assistant" | "tool";
content:
| string
| Array<
| { type: "text"; text: string }
| { type: "image_url"; image_url: { url: string; detail?: "auto" | "low" | "high" } }
>;
toolCalls?: Array<{ id: string; name: string; arguments: string }>;
toolResponse?: {
toolCallId: string;
name: string;
content: string | Array<{ type: "text"; text: string }>;
isError?: boolean;
};
};

Examples
OpenAI with forge()
import { z } from "zod";
import { forge } from "reforge-ai";
import { openaiCompatible } from "reforge-ai/openai-compatible";
import OpenAI from "openai";
const provider = openaiCompatible(new OpenAI(), "gpt-4o");
const RecipeSchema = z.object({
title: z.string(),
ingredients: z.array(z.string()),
steps: z.array(z.string()),
});
const result = await forge(
provider,
[
{ role: "system", content: "Return JSON only." },
{ role: "user", content: "Give me a recipe for chocolate cake." },
],
RecipeSchema,
{ maxRetries: 3, providerOptions: { temperature: 0.2 } }
);
if (result.success) {
console.log(`Resolved in ${result.telemetry.attempts} attempt(s)`);
console.log(result.data);
}

Anthropic with forge()
import { z } from "zod";
import { forge } from "reforge-ai";
import { anthropic } from "reforge-ai/anthropic";
import Anthropic from "@anthropic-ai/sdk";
const provider = anthropic(new Anthropic(), "claude-sonnet-4-20250514");
const SummarySchema = z.object({
title: z.string(),
summary: z.string(),
tags: z.array(z.string()),
});
const result = await forge(
provider,
[{ role: "user", content: "Summarize: TypeScript 5.7 adds ..." }],
SummarySchema
);

guard() Only (Manual Retry)
import OpenAI from "openai";
import { z } from "zod";
import { guard } from "reforge-ai";
const client = new OpenAI();
const ProductSchema = z.object({
name: z.string(),
price: z.number(),
tags: z.array(z.string()),
});
async function getProduct(prompt: string) {
const messages: OpenAI.ChatCompletionMessageParam[] = [
{ role: "user", content: prompt },
];
for (let attempt = 0; attempt < 3; attempt++) {
const response = await client.chat.completions.create({
model: "gpt-4o",
messages,
});
const raw = response.choices[0]?.message?.content ?? "";
const result = guard(raw, ProductSchema);
if (result.success) return result.data;
messages.push({ role: "assistant", content: raw });
messages.push({ role: "user", content: result.retryPrompt });
}
throw new Error("Failed after 3 attempts");
}

Edge Runtime (Next.js API Route)
// app/api/parse/route.ts
import { z } from "zod";
import { guard } from "reforge-ai";
export const runtime = "edge";
const PayloadSchema = z.object({
action: z.string(),
data: z.record(z.unknown()),
});
export async function POST(request: Request) {
const body = await request.text();
const result = guard(body, PayloadSchema);
if (result.success) {
return Response.json({ ok: true, data: result.data });
}
return Response.json(
{ ok: false, errors: result.errors },
{ status: 422 },
);
}

Performance
Reforge is designed for < 5ms end-to-end on a 2KB input. The entire pipeline is:
- Synchronous — no async, no network, no I/O
- Pure — no global state mutation
- O(n) — linear time relative to input length
- Never throws — all error paths return typed result objects
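To check the timing snapshot in your own environment, a generic harness like the one below works for any synchronous function, including guard(). The JSON.parse call at the bottom is only a stand-in target; point timeSync at your own guard() call:

```typescript
// Generic micro-timing harness for synchronous functions.
// Uses the global performance.now() (available in Node 16+, Bun, Deno, browsers).
function timeSync<T>(fn: () => T, runs = 11): { min: number; max: number; avg: number } {
  const samples: number[] = [];
  for (let i = 0; i < runs; i++) {
    const t0 = performance.now();
    fn();
    samples.push(performance.now() - t0);
  }
  return {
    min: Math.min(...samples),
    max: Math.max(...samples),
    avg: samples.reduce((a, b) => a + b, 0) / samples.length,
  };
}

// e.g. timeSync(() => guard(raw, Schema)) — 11 samples, like the snapshot above.
const stats = timeSync(() => JSON.parse('{"a": 1}'));
```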
Guarantees
- Zero dependencies in core — zod is a required peer dependency
- Environment agnostic — no Node-specific APIs (fs, path, Buffer)
- Tree-shakeable — ESM + CJS dual output via tsup
- Strict TypeScript — full type safety with discriminated union results
Environment Compatibility
| Runtime | Status | Notes |
|---|---|---|
| Node.js 16+ | ✅ Supported | CJS + ESM |
| Bun | ✅ Supported | Native ESM |
| Deno | ✅ Supported | Via npm: specifier |
| Cloudflare Workers | ✅ Supported | No Node APIs |
| Vercel Edge | ✅ Supported | Edge-compatible |
| Browser | ✅ Supported | ESM, tree-shakeable |
Documentation
Full documentation, interactive demo, and integration guides are available at reforge-ai-97558.web.app.
Contributing
We welcome contributions! Please see CONTRIBUTING.md for:
- Setup instructions
- Code standards
- PR workflow
- Test guidelines
Reporting Issues
- Bug reports: Open an issue with the raw input, schema, and unexpected result.
- Feature requests: Open an issue with the use case and proposed API.
Changelog
See CHANGELOG.md for a detailed history of changes.
License
GNU GPL v3
