llm-retry
v0.1.2
Smart retry orchestrator for LLM output parsing and validation. Automatically repairs malformed output, validates against a schema, and re-prompts the model with structured error feedback until the response is valid -- or retries are exhausted.
Why llm-retry
Generic retry libraries (p-retry, async-retry) re-execute the same function after a delay. That works for transient network errors, but not for structural LLM output failures. When a model returns malformed JSON or an object missing required fields, blindly retrying the same prompt produces the same mistake. Fixing this requires feeding the validation errors back to the model as part of the next prompt, giving it specific information about what was wrong and how to correct it.
llm-retry provides exactly this loop: call the LLM, attempt local repair on the raw output, validate against a schema, and if validation fails, format the errors into a feedback message, append it to the conversation, and re-call the LLM. The package is provider-agnostic -- it accepts any async function that takes messages and returns a string. Zero external runtime dependencies.
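The loop is easiest to see in miniature. The sketch below is a conceptual re-creation of the cycle described above, not the package's actual implementation; `retryLoop`, `mockLLM`, and `validate` are stand-ins invented for this example:

```javascript
// Conceptual sketch of the call -> validate -> feed-back-errors cycle.
// NOT the library's code; retryLoop, mockLLM, and validate are illustrative.
async function retryLoop(callLLM, validate, messages, maxRetries) {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    const raw = await callLLM(messages);
    const result = validate(raw);
    if (result.success) return { success: true, data: result.data };
    // Append the failed response and the errors so the next call can correct them
    messages = [
      ...messages,
      { role: 'assistant', content: raw },
      { role: 'user', content: `Fix these errors: ${result.errors.join('; ')}` },
    ];
  }
  return { success: false };
}

// Mock model: emits broken JSON first, then corrects itself once it sees feedback
const mockLLM = async (messages) =>
  messages.length > 1 ? '{"name": "Alice"}' : '{"name": "Alice",}';
const validate = (raw) => {
  try { return { success: true, data: JSON.parse(raw) }; }
  catch (e) { return { success: false, errors: [e.message] }; }
};

retryLoop(mockLLM, validate, [{ role: 'user', content: 'Name as JSON.' }], 3)
  .then((out) => console.log(out.success, out.data.name)); // true Alice
```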
Installation
npm install llm-retry
Requires Node.js >= 18.
Quick Start
import { retryWithValidation, heuristicValidator } from 'llm-retry';
// Wrap your LLM SDK call -- any provider works
const callLLM = async (messages, context) => {
const response = await openai.chat.completions.create({
model: context?.model ?? 'gpt-4o-mini',
temperature: context?.temperature ?? 0.2,
messages,
});
return response.choices[0].message.content;
};
const validator = heuristicValidator({
type: 'object',
required: ['name', 'age'],
});
const result = await retryWithValidation(callLLM, validator, {
messages: [{ role: 'user', content: 'Return a JSON object with name and age.' }],
maxRetries: 3,
});
if (result.success) {
console.log(result.data); // { name: "Alice", age: 30 }
} else {
console.error(result.error);
console.error(`Failed after ${result.attempts.length} attempts`);
}
Features
- Provider-agnostic -- Works with OpenAI, Anthropic, Google Gemini, Mistral, Ollama, llama.cpp, vLLM, or any custom inference server. You provide the LLM call function; llm-retry provides the retry loop.
- Automatic output repair -- Strips markdown code fences, removes trailing commas, fixes unclosed strings, and extracts JSON from surrounding prose before validation. Repair is local and free, avoiding unnecessary retries.
- Structured error feedback -- When validation fails, errors are formatted into a feedback message appended to the conversation. The model receives targeted correction information instead of a blind retry.
- Configurable repair levels -- Four levels from none (raw passthrough) to aggressive (balanced JSON block extraction). Custom repair functions supported.
- Configurable feedback strategies -- Control whether the full previous response, a truncated version, only the errors, or nothing is sent back to the model.
- Reusable retrier factory -- Pre-configure a retrier with createRetrier and reuse it across multiple calls with different messages.
- Attempt-level observability -- Every attempt is recorded with raw output, repaired output, validation errors, duration, and success status. The onAttempt callback fires after each attempt.
- Full TypeScript support -- Strict types for all exports. Generic RetryResult<T> and Retrier<T> types propagate the validated data type.
- Zero runtime dependencies -- Ships with no external dependencies. The built-in heuristicValidator covers type checks and required-field validation without requiring Zod or Ajv.
API Reference
retryWithValidation
The primary function. Calls the LLM, repairs the output, validates it, and retries with feedback on failure.
function retryWithValidation<T>(
callLLM: CallLLMFunction,
validator: ValidatorFunction<T>,
options?: RetryOptions & { messages: Message[] }
): Promise<RetryResult<T>>
Parameters:
| Parameter | Type | Description |
|-------------|-------------------------|-------------|
| callLLM | CallLLMFunction | Async function that sends messages to an LLM and returns the raw string response. Receives (messages, context). |
| validator | ValidatorFunction<T> | Function that validates parsed data and returns a ValidationResult<T>. Use heuristicValidator() or provide a custom implementation. |
| options | RetryOptions & { messages } | Configuration object including the initial messages array. See RetryOptions below. |
Returns: Promise<RetryResult<T>> -- see RetryResult.
Example:
const result = await retryWithValidation(callLLM, validator, {
messages: [{ role: 'user', content: 'Extract the person as JSON.' }],
maxRetries: 5,
repair: { level: 'aggressive' },
feedbackStrategy: 'full',
temperature: 0.3,
model: 'gpt-4o',
onAttempt: (record) => {
console.log(`Attempt ${record.attempt}: ${record.success ? 'ok' : 'failed'}`);
},
});
createRetrier
Factory function that returns a pre-configured Retrier<T> instance. Useful when the same validator and options are reused across many calls.
function createRetrier<T>(
config: RetrierConfig & { validator: ValidatorFunction<T> }
): Retrier<T>
Parameters:
| Parameter | Type | Description |
|-----------|------|-------------|
| config | RetrierConfig & { validator } | Configuration including a validator function and all retry options. |
Returns: A Retrier<T> object with a retry method and a config property.
Example:
import { createRetrier, heuristicValidator } from 'llm-retry';
const retrier = createRetrier({
validator: heuristicValidator({ type: 'object', required: ['id', 'title'] }),
maxRetries: 5,
repair: { level: 'aggressive' },
feedbackStrategy: 'truncated',
});
// Reuse across multiple calls
const result1 = await retrier.retry(callLLM, [
{ role: 'user', content: 'Extract article 1 as JSON.' },
]);
const result2 = await retrier.retry(callLLM, [
{ role: 'user', content: 'Extract article 2 as JSON.' },
]);
// Inspect config
console.log(retrier.config.maxRetries); // 5
heuristicValidator
Built-in, zero-dependency validator. Checks type and required fields against a simple schema object.
function heuristicValidator(
schema?: Record<string, unknown>
): ValidatorFunction<unknown>
Behavior by schema:
| Schema provided | Validation behavior |
|-----------------|---------------------|
| No schema | Verifies the output is valid parseable JSON. Non-JSON strings are rejected. Parsed objects and arrays pass. |
| { type } | Checks that typeof the parsed value (or 'array' for arrays) matches the specified type string. |
| { required } | Checks that all listed keys exist on the parsed object. |
| { type, required } | Both checks are applied. |
Example:
import { heuristicValidator } from 'llm-retry';
// Accept any valid JSON
const anyJson = heuristicValidator();
// Require a specific shape
const personValidator = heuristicValidator({
type: 'object',
required: ['name', 'age', 'email'],
});
repairOutput
Applies repair transformations to raw LLM output based on the specified repair level.
function repairOutput(raw: string, level: RepairLevel): string
Parameters:
| Parameter | Type | Description |
|-----------|---------------|-------------|
| raw | string | The raw LLM output string. |
| level | RepairLevel | One of 'none', 'minimal', 'standard', 'aggressive'. |
Returns: The repaired string.
extractCodeFence
Extracts content from a markdown code fence.
function extractCodeFence(text: string, lang?: string): string | null
Parameters:
| Parameter | Type | Description |
|-----------|----------|-------------|
| text | string | Input text that may contain a code fence. |
| lang | string | Optional language tag to match (e.g., 'json'). If omitted, matches any fence. |
Returns: The content inside the fence, or null if no fence is found.
removeTrailingCommas
Removes trailing commas before } or ] in a JSON string.
function removeTrailingCommas(json: string): string
extractJsonBlock
Finds the first balanced {...} or [...] block in a string using bracket-matching with string-escape awareness.
function extractJsonBlock(text: string): string | null
Returns: The extracted JSON block, or null if no balanced block is found.
buildFeedbackMessage
Constructs the feedback message sent to the LLM on retry, based on the configured strategy.
function buildFeedbackMessage(
rawOutput: string,
errors: ValidationError[],
strategy: 'errors-only' | 'full' | 'truncated' | 'none'
): string
Parameters:
| Parameter | Type | Description |
|-------------|---------------------|-------------|
| rawOutput | string | The raw LLM output from the failed attempt. |
| errors | ValidationError[] | The validation errors from the failed attempt. |
| strategy | Feedback strategy | Controls what context is included. See Feedback Strategies. |
Returns: The formatted feedback string. Returns an empty string for the 'none' strategy.
wrapErrors
Wraps an arbitrary caught error into a ValidationError[] array.
function wrapErrors(error: unknown): ValidationError[]
Useful for converting exceptions from callLLM into the standard error format used throughout the retry loop.
formatValidationErrors
Formats an array of ValidationError objects into a human-readable string.
function formatValidationErrors(errors: ValidationError[]): string
Example output:
Found 2 validation errors:
1. path 'name': required field missing (expected: field to exist) (received: undefined)
2. path '': expected object, got string (expected: object) (received: string)
Returns "No validation errors." for an empty array.
Configuration
RetryOptions
interface RetryOptions {
maxRetries?: number;
repair?: RepairConfig;
feedbackStrategy?: 'errors-only' | 'full' | 'truncated' | 'none';
temperature?: number;
model?: string;
onAttempt?: (record: AttemptRecord) => void;
systemPrompt?: string;
}
| Option | Type | Default | Description |
|--------------------|-------------------|------------------|-------------|
| maxRetries | number | 3 | Maximum number of LLM call attempts. |
| repair | RepairConfig | { level: 'standard' } | Controls output repair behavior. See Repair Levels. |
| feedbackStrategy | string | 'errors-only' | Controls what error context is sent to the LLM on retry. See Feedback Strategies. |
| temperature | number | undefined | Passed to callLLM via RetryContext. Use this to control the LLM's temperature from the retry layer. |
| model | string | undefined | Passed to callLLM via RetryContext. Use this to specify or switch models from the retry layer. |
| onAttempt | function | undefined | Callback fired after each attempt with the full AttemptRecord. |
| systemPrompt | string | undefined | System prompt to include in the conversation. |
RepairConfig
interface RepairConfig {
level?: RepairLevel;
custom?: (text: string) => string;
}
| Option | Type | Default | Description |
|----------|------------|--------------|-------------|
| level | RepairLevel | 'standard' | Built-in repair aggressiveness. See Repair Levels. |
| custom | function | undefined | A custom repair function applied after built-in repairs. Receives the repaired text and returns the further-repaired text. |
Repair Levels
Control how aggressively the raw LLM output is transformed before validation.
| Level | Operations |
|--------------|------------|
| none | No repair. Raw output is passed directly to validation. |
| minimal | Extract content from markdown code fences (```json ... ``` or ``` ... ```). Trim whitespace. |
| standard | All of minimal, plus remove trailing commas before } or ], and fix unclosed strings (append a closing " when the quote count is odd). |
| aggressive | All of standard, plus extract the first balanced {...} or [...] block from the text using bracket-matching. |
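To make the table concrete, here is a rough, hedged re-implementation of the minimal and standard steps in plain JavaScript. This is illustrative only, not the package's source; the library's edge-case handling may differ:

```javascript
// Illustrative approximations of the 'minimal' and 'standard' repair steps.
// Extract the body of a fenced code block, if one is present; otherwise trim.
const stripFence = (text) => {
  const m = text.match(/```(?:\w+)?\n([\s\S]*?)\n```/);
  return (m ? m[1] : text).trim();
};
// Remove trailing commas immediately before } or ]
const dropTrailingCommas = (json) => json.replace(/,\s*([}\]])/g, '$1');
// Append a closing quote when the count of unescaped quotes is odd
const closeOddQuote = (s) =>
  ((s.match(/(?<!\\)"/g) || []).length % 2 === 1 ? s + '"' : s);

const standardRepair = (raw) => closeOddQuote(dropTrailingCommas(stripFence(raw)));
console.log(standardRepair('```json\n{"a": 1,}\n```')); // {"a": 1}
```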
Example with custom repair:
const result = await retryWithValidation(callLLM, validator, {
messages,
repair: {
level: 'standard',
custom: (text) => text.replace(/\/\/.*/g, ''), // strip JS-style line comments
},
});
Feedback Strategies
Control what error context is appended to the conversation when retrying.
| Strategy | Behavior |
|---------------|----------|
| errors-only | Sends only the formatted validation errors. This is the default. |
| full | Sends the full previous LLM response followed by the formatted errors. |
| truncated | Sends the first 500 characters of the previous response followed by the formatted errors. |
| none | No feedback is appended. The retry uses the original messages unchanged. |
When feedback is enabled, the retry loop appends two messages to the conversation: the LLM's failed response as an assistant message, and the feedback as a user message. This gives the model full context for correction on the next attempt.
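Shape-wise, the append step looks like this (a hedged sketch of the message layout only; the actual feedback text is produced by buildFeedbackMessage and may differ from the invented string below):

```javascript
// Sketch of how the conversation grows after a failed attempt.
// The feedback wording here is invented for illustration.
const messages = [{ role: 'user', content: 'Return a person as JSON.' }];
const failedOutput = '{"name": "Alice",}';
const feedback = "Found 1 validation error:\n1. path '': invalid JSON (trailing comma)";

const nextMessages = [
  ...messages,
  { role: 'assistant', content: failedOutput }, // the model's failed response
  { role: 'user', content: feedback },          // structured error feedback
];
console.log(nextMessages.map((m) => m.role)); // [ 'user', 'assistant', 'user' ]
```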
Error Handling
RetryResult
Every call to retryWithValidation or retrier.retry returns a RetryResult<T>:
interface RetryResult<T> {
success: boolean;
data?: T;
error?: string;
attempts: AttemptRecord[];
totalDurationMs: number;
finalOutput?: string;
}
| Field | Type | Description |
|------------------|-------------------|-------------|
| success | boolean | true if validation passed on any attempt. |
| data | T \| undefined | The validated, typed data. Present when success is true. |
| error | string \| undefined | Error message when all retries are exhausted. Typically "Max retries exceeded". |
| attempts | AttemptRecord[] | Full record of every attempt, in order. |
| totalDurationMs| number | Sum of all attempt durations in milliseconds. |
| finalOutput | string \| undefined | The final raw output from the last attempt. |
AttemptRecord
Each attempt is recorded with full diagnostic information:
interface AttemptRecord {
attempt: number;
rawOutput: string;
repairedOutput?: string;
validationErrors?: ValidationError[];
durationMs: number;
success: boolean;
}
| Field | Type | Description |
|--------------------|-----------------------|-------------|
| attempt | number | 1-based attempt number. |
| rawOutput | string | The raw string returned by callLLM. Empty string if the call threw an error. |
| repairedOutput | string \| undefined | The output after repair transformations, if repair was applied. |
| validationErrors | ValidationError[] \| undefined | Validation errors for this attempt. undefined on success. |
| durationMs | number | Wall-clock duration of this attempt in milliseconds. |
| success | boolean | Whether validation passed on this attempt. |
ValidationError
interface ValidationError {
path: string;
message: string;
code: string;
expected?: unknown;
received?: unknown;
}
| Field | Type | Description |
|------------|-----------|-------------|
| path | string | JSON path to the field with the error (e.g., 'name', '' for root-level errors). |
| message | string | Human-readable error description. |
| code | string | Machine-readable error code (e.g., 'invalid_json', 'required', 'invalid_type', 'unexpected_error'). |
| expected | unknown | What was expected (optional). |
| received | unknown | What was received (optional). |
Handling callLLM Exceptions
If callLLM throws an error (network failure, API error, etc.), the exception is caught, wrapped into a ValidationError with code 'unexpected_error', and recorded in the AttemptRecord. The retry loop continues to the next attempt. This means transient errors do not immediately abort the loop.
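Conceptually, the wrapping looks something like this. It is an illustrative approximation built from the ValidationError fields documented above, not the package's source, and `wrapCaught` is a name invented for this sketch:

```javascript
// Approximation of how a thrown error becomes a ValidationError[] (cf. wrapErrors).
function wrapCaught(error) {
  const message = error instanceof Error ? error.message : String(error);
  return [{ path: '', message, code: 'unexpected_error' }];
}

const errs = wrapCaught(new TypeError('fetch failed'));
console.log(errs[0].code, errs[0].message); // unexpected_error fetch failed
```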
Advanced Usage
Custom Validator
Implement ValidatorFunction<T> for full control over validation logic:
import type { ValidatorFunction, ValidationResult } from 'llm-retry';
import { retryWithValidation } from 'llm-retry';
interface Person {
name: string;
age: number;
}
const personValidator: ValidatorFunction<Person> = (data): ValidationResult<Person> => {
if (typeof data !== 'object' || data === null) {
return {
success: false,
errors: [{ path: '', message: 'expected an object', code: 'invalid_type' }],
};
}
const obj = data as Record<string, unknown>;
const errors = [];
if (typeof obj.name !== 'string') {
errors.push({
path: 'name',
message: 'expected a string',
code: 'invalid_type',
expected: 'string',
received: typeof obj.name,
});
}
if (typeof obj.age !== 'number' || obj.age < 0 || obj.age > 150) {
errors.push({
path: 'age',
message: 'expected a number between 0 and 150',
code: 'invalid_type',
expected: 'number (0-150)',
received: String(obj.age),
});
}
if (errors.length > 0) {
return { success: false, errors };
}
return { success: true, data: obj as Person };
};
const result = await retryWithValidation(callLLM, personValidator, {
messages: [{ role: 'user', content: 'Return a person object with name and age.' }],
});
if (result.success) {
// result.data is typed as Person
console.log(result.data.name, result.data.age);
}
Observability with onAttempt
Track every attempt for logging, metrics, or alerting:
const result = await retryWithValidation(callLLM, validator, {
messages,
maxRetries: 5,
onAttempt: (record) => {
console.log(JSON.stringify({
attempt: record.attempt,
success: record.success,
durationMs: record.durationMs,
errorCount: record.validationErrors?.length ?? 0,
}));
if (!record.success && record.validationErrors) {
for (const err of record.validationErrors) {
console.warn(` [${err.code}] ${err.path}: ${err.message}`);
}
}
},
});
Provider Adapter Patterns
llm-retry works with any LLM provider. The callLLM function receives a RetryContext as the second argument, which carries attempt, temperature, model, maxTokens, and escalated fields.
OpenAI:
import OpenAI from 'openai';
const openai = new OpenAI();
const callLLM = async (messages, context) => {
const response = await openai.chat.completions.create({
model: context?.model ?? 'gpt-4o-mini',
temperature: context?.temperature ?? 0.2,
messages,
});
return response.choices[0].message.content ?? '';
};
Anthropic:
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const callLLM = async (messages, context) => {
const response = await client.messages.create({
model: context?.model ?? 'claude-sonnet-4-20250514',
max_tokens: context?.maxTokens ?? 1024,
messages,
});
return response.content[0].type === 'text' ? response.content[0].text : '';
};
Ollama:
const callLLM = async (messages, context) => {
const response = await fetch('http://localhost:11434/api/chat', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
model: context?.model ?? 'llama3',
messages,
stream: false,
options: { temperature: context?.temperature ?? 0.2 },
}),
});
const data = await response.json();
return data.message.content;
};
Using Repair Utilities Standalone
The repair functions are exported individually and can be used outside the retry loop:
import { extractCodeFence, removeTrailingCommas, extractJsonBlock, repairOutput } from 'llm-retry';
// Extract from a ```json fence
const json = extractCodeFence('```json\n{"key": "value"}\n```', 'json');
// '{"key": "value"}'
// Remove trailing commas
const fixed = removeTrailingCommas('{"a": 1, "b": 2,}');
// '{"a": 1, "b": 2}'
// Extract first balanced JSON block from prose
const block = extractJsonBlock('The result is {"name": "Alice"} as requested.');
// '{"name": "Alice"}'
// Apply full repair pipeline
const repaired = repairOutput('```json\n{"x": 1,}\n```', 'standard');
// '{"x": 1}'
TypeScript
llm-retry is written in strict TypeScript and ships with declaration files. All public types are exported from the package root.
import type {
Message,
RetryContext,
CallLLMFunction,
ValidationError,
ValidationResult,
ValidatorFunction,
RepairLevel,
RepairConfig,
AttemptRecord,
RetryResult,
RetryOptions,
RetrierConfig,
Retrier,
} from 'llm-retry';
Key Type Definitions
// Message format (standard chat completion format)
interface Message {
role: 'system' | 'user' | 'assistant';
content: string;
}
// The function you provide to call your LLM
type CallLLMFunction = (
messages: Message[],
context?: RetryContext
) => Promise<string>;
// Context passed to callLLM on each attempt
interface RetryContext {
attempt: number;
temperature?: number;
model?: string;
maxTokens?: number;
escalated: boolean;
}
// Your validator must return this discriminated union
type ValidationResult<T> =
| { success: true; data: T }
| { success: false; errors: ValidationError[] };
// Validator function signature
type ValidatorFunction<T> = (data: unknown) => ValidationResult<T>;
// Repair level setting
type RepairLevel = 'none' | 'minimal' | 'standard' | 'aggressive';
// Reusable retrier instance
interface Retrier<T> {
retry(callLLM: CallLLMFunction, messages: Message[]): Promise<RetryResult<T>>;
readonly config: RetrierConfig;
}
License
MIT
