llm-json-fix v1.0.0
LLM JSON Fix
A comprehensive library for repairing malformed JSON outputs from Large Language Models (LLMs).
Why This Library?
JSON outputs from LLMs are powerful but notoriously inconsistent. Even a small 1% failure rate in JSON formatting can cause system failures that are difficult to debug. This library automatically identifies and repairs common issues in LLM-generated JSON, making your AI integrations more robust and reliable.
Features
- LLM-Specific Repairs: Handles unique issues in AI-generated content
- Markdown Cleanup: Removes code blocks, explanatory text, and other non-JSON content
- Streaming Support: Process arbitrarily large documents with minimal memory usage
- Schema Flexibility: Works with any JSON structure
- Model-Specific Optimizations: Can be configured for OpenAI, Anthropic, or other LLMs
Installation
```sh
# Using npm
npm install llm-json-fix

# Using yarn
yarn add llm-json-fix

# Using pnpm
pnpm add llm-json-fix
```
Requirements
- Node.js 14.0.0 or higher
- Works in both CommonJS and ESM environments
Basic Usage
```js
import { fixLLMJson } from 'llm-json-fix';

// Malformed JSON from an LLM
const response = `Here's the JSON you requested: \`\`\`json
{
  name: "John",
  items: ['apple', 'banana', ...],
  active: True
}
\`\`\``;

// Repair the JSON
const fixedJson = fixLLMJson(response);

// Use the fixed JSON
const data = JSON.parse(fixedJson);
console.log(data);
```
Issues Fixed
Incomplete JSON Structures
- Truncated outputs where closing brackets are missing
- Unfinished arrays or objects due to token limits
- Partial final elements
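To illustrate the idea behind truncation repair, here is a minimal sketch, not this library's actual implementation (`closeTruncated` is a hypothetical helper), that tracks unclosed brackets with a stack and appends the missing closers:

```js
// Hypothetical sketch -- NOT llm-json-fix's implementation.
// Walk the text, push the expected closer for every '{' or '[',
// pop on '}' or ']', and append whatever is still open at the end.
// String contents are skipped so brackets inside strings don't count.
function closeTruncated(json) {
  const stack = [];
  let inString = false;
  for (let i = 0; i < json.length; i++) {
    const ch = json[i];
    if (inString) {
      if (ch === '\\') i++;              // skip the escaped character
      else if (ch === '"') inString = false;
    } else if (ch === '"') inString = true;
    else if (ch === '{') stack.push('}');
    else if (ch === '[') stack.push(']');
    else if (ch === '}' || ch === ']') stack.pop();
  }
  return json + stack.reverse().join('');
}

console.log(closeTruncated('{"items": [1, 2'));  // {"items": [1, 2]}
```

A real repair also has to close unterminated strings and discard half-written final elements, which this sketch ignores.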
Quote Inconsistencies
- Mixing of single and double quotes
- Unclosed quotes
- Incorrectly escaped quotes within strings
Schema Violations
- Property names without quotes
- Extra or missing commas
- Trailing commas (valid in JavaScript but invalid in JSON)
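As a rough illustration (again not the library's implementation; `fixSchemaViolations` is a hypothetical name), the unquoted-key and trailing-comma cases can often be patched with two regex passes. A robust repair needs a tolerant parser instead, since regexes can mangle string contents that happen to match:

```js
// Hypothetical sketch -- NOT llm-json-fix's implementation.
function fixSchemaViolations(json) {
  return json
    // quote bare property names: {name: ...} -> {"name": ...}
    .replace(/([{,]\s*)([A-Za-z_$][\w$]*)(\s*:)/g, '$1"$2"$3')
    // drop trailing commas before a closing brace or bracket
    .replace(/,(\s*[}\]])/g, '$1');
}

console.log(fixSchemaViolations('{name: "John", tags: ["a", "b",],}'));
// {"name": "John", "tags": ["a", "b"]}
```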
Markdown Artifacts
- Code block markers (```) included in the JSON
- Explanation text mixed with JSON output
- Markdown formatting within JSON strings
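A crude way to strip such artifacts (a hypothetical sketch, not the library's approach) is to keep only the span from the first opening bracket to the last closing one, which discards fence markers and surrounding prose in one step:

```js
// Hypothetical sketch -- NOT llm-json-fix's implementation.
// Keeps the substring from the first '{' or '[' to the last '}' or ']'.
function extractJson(text) {
  const start = text.search(/[{[]/);
  const end = Math.max(text.lastIndexOf('}'), text.lastIndexOf(']'));
  if (start === -1 || end < start) return text.trim();
  return text.slice(start, end + 1);
}

console.log(extractJson('Sure! {"ok": true} Hope that helps.'));  // {"ok": true}
```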
LLM Hallucinations
- Explanatory comments included in the JSON
- "..." or "[more items]" placeholders
- Natural language interruptions mid-JSON
Nested JSON Formatting Issues
- Inconsistent indentation
- Improperly escaped nested JSON strings
- Confusion between string representations of objects and actual objects
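The last point, stringified objects where real objects were expected, can be illustrated with a hypothetical post-processing pass (not the library's implementation) that re-parses string values that look like serialized JSON:

```js
// Hypothetical sketch -- NOT llm-json-fix's implementation.
// Recursively walks a parsed value and replaces any string that looks
// like serialized JSON with the object it represents.
function inlineStringifiedJson(value) {
  if (Array.isArray(value)) return value.map(inlineStringifiedJson);
  if (value !== null && typeof value === 'object') {
    const out = {};
    for (const [key, v] of Object.entries(value)) out[key] = inlineStringifiedJson(v);
    return out;
  }
  if (typeof value === 'string' && /^\s*[{[]/.test(value)) {
    try { return inlineStringifiedJson(JSON.parse(value)); }
    catch { return value; }  // not actually JSON -- keep the string
  }
  return value;
}

const data = { user: '{"name": "John"}', note: 'plain text' };
console.log(inlineStringifiedJson(data));  // { user: { name: 'John' }, note: 'plain text' }
```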
API Reference
Regular API
```ts
fixLLMJson(text: string, options?: FixLLMJsonOptions): string
```
Options
```ts
interface FixLLMJsonOptions {
  // Whether to apply model-specific fixes (default: true)
  applyModelSpecificFixes?: boolean;

  // The specific LLM model being used, for optimized repairs
  // Supported values: 'openai', 'anthropic', 'general'
  model?: 'openai' | 'anthropic' | 'general';

  // Whether to preserve comments in the JSON (default: false)
  preserveComments?: boolean;

  // Whether to be verbose about changes being made
  verbose?: boolean;
}
```
Streaming API
For processing large files or streams:
```js
import { createLLMJsonFixStream } from 'llm-json-fix/stream';
import { createReadStream, createWriteStream } from 'fs';
import { pipeline } from 'stream';

const inputStream = createReadStream('broken.json');
const outputStream = createWriteStream('fixed.json');

const fixStream = createLLMJsonFixStream({
  bufferSize: 64 * 1024, // 64 KB
  model: 'openai'
});

pipeline(inputStream, fixStream, outputStream, (err) => {
  if (err) {
    console.error('Error:', err);
  } else {
    console.log('JSON successfully repaired!');
  }
});
```
Command Line Interface
This package provides a command-line tool for repairing JSON files:
```sh
# Install globally
npm install -g llm-json-fix

# Repair a file
llm-json-fix broken.json > fixed.json

# Or with options
llm-json-fix broken.json --output fixed.json --model openai --verbose
```
CLI Options
```
--version, -v         Show application version
--help, -h            Display help for command
--output, -o          Output file
--overwrite           Overwrite the input file
--buffer              Buffer size in bytes, for example 64K (default) or 1M
--model               Specify the LLM model (openai, anthropic, general)
--verbose             Show detailed repair information
--preserve-comments   Preserve comments in the output
```
Examples
See the examples directory for more usage examples.
Common Patterns & Integration Tips
With OpenAI
```js
try {
  const response = await openai.chat.completions.create({
    model: "gpt-4",
    messages: [
      { role: "system", content: "Respond with valid JSON only." },
      { role: "user", content: prompt }
    ]
  });

  const content = response.choices[0].message.content;
  const fixedJson = fixLLMJson(content, { model: 'openai' });
  const data = JSON.parse(fixedJson);
  // Use the data...
} catch (error) {
  console.error('Error:', error);
}
```
With Anthropic Claude
```js
try {
  const response = await anthropic.messages.create({
    model: "claude-3-opus-20240229",
    max_tokens: 4000,
    messages: [
      { role: "user", content: "Return this data as JSON: " + prompt }
    ],
    system: "Return only valid JSON data with no additional text."
  });

  const content = response.content[0].text;
  const fixedJson = fixLLMJson(content, { model: 'anthropic' });
  const data = JSON.parse(fixedJson);
  // Use the data...
} catch (error) {
  console.error('Error:', error);
}
```
License
Package Contents
The npm package includes:
- CommonJS build for Node.js environments
- ESM build for modern JavaScript environments
- UMD build for browser usage
- TypeScript type definitions
- CLI executable
- Full documentation
For more information, see the changelog.
