structured-llm

v0.3.1

Provider-agnostic TypeScript library for Zod-validated, fully-typed structured output from any LLM


npm install structured-llm zod

import { generate } from "structured-llm";
import { z } from "zod";

const { data } = await generate({
  client: openai,       // pass your existing OpenAI / Anthropic / Gemini / Mistral client
  model: "gpt-4o-mini",
  schema: z.object({
    sentiment: z.enum(["positive", "negative", "neutral"]),
    score: z.number().min(0).max(1),
    tags: z.array(z.string()),
  }),
  prompt: "Analyze: The new MacBook completely changed how I work.",
});

console.log(data.sentiment); // "positive"
console.log(data.score);     // 0.94
console.log(data.tags);      // ["productivity", "hardware", "apple"]
// fully typed — no casting, no guessing

Why another structured output library?

You have a few options today:

| | structured-llm | Vercel AI SDK | instructor-js |
|---|---|---|---|
| Bring your own client | yes | no (their SDK) | partial |
| Zero runtime dependencies | yes | no | no |
| 14 providers | yes | yes | OpenAI only |
| Streaming partial objects | yes | yes | no |
| Fallback chain | yes | no | no |
| Retry with error feedback | yes | basic | yes |
| Standard Schema (Valibot, ArkType) | yes | no | no |
| Custom schema (no Zod) | yes | no | no |
| Works with local Ollama | yes | limited | no |
| AWS Bedrock | yes | no | no |

structured-llm has one job: take any LLM client you already have, take a Zod schema you already wrote, give back a typed object. No ecosystem lock-in.



Installation

npm install structured-llm zod
# or
pnpm add structured-llm zod
# or
yarn add structured-llm zod

Install only the provider SDKs you actually use:

npm install openai                       # OpenAI, Groq, xAI, Together, Fireworks, Ollama, Azure
npm install @anthropic-ai/sdk            # Anthropic
npm install @google/genai                # Gemini
npm install @mistralai/mistralai         # Mistral
npm install cohere-ai                    # Cohere
npm install @aws-sdk/client-bedrock-runtime  # AWS Bedrock

Requires: Node.js 18+, TypeScript 5+ (strict mode recommended)


Core functions

generate

Extracts a single structured object from the LLM.

import OpenAI from "openai";
import { z } from "zod";
import { generate } from "structured-llm";

const openai = new OpenAI(); // reads OPENAI_API_KEY from env

const InvoiceSchema = z.object({
  vendor: z.string(),
  amount: z.number(),
  currency: z.string().length(3),
  dueDate: z.string().describe("ISO 8601 date"),
  lineItems: z.array(z.object({
    description: z.string(),
    quantity: z.number(),
    unitPrice: z.number(),
  })),
  isPaid: z.boolean(),
});

const { data, usage } = await generate({
  client: openai,
  model: "gpt-4o-mini",
  schema: InvoiceSchema,
  prompt: invoiceText,
  systemPrompt: "You are a precise invoice parser.",
  temperature: 0,
  maxRetries: 3,
  trackUsage: true,
});

// data is fully typed as z.infer<typeof InvoiceSchema>
console.log(data.vendor);        // "Acme Corp"
console.log(data.lineItems[0]);  // { description: "...", quantity: 2, unitPrice: 49.99 }
console.log(usage?.estimatedCostUsd); // 0.000043

All options:

generate({
  // Provider — one of these two forms
  client: openai,              // pass an existing client (auto-detected)
  // OR
  provider: "openai",          // reads API key from env (OPENAI_API_KEY)
  apiKey: "sk-...",            // or pass the key directly
  baseURL: "...",              // optional custom endpoint

  model: "gpt-4o-mini",        // required
  schema: MyZodSchema,         // required — Zod, Standard Schema, or custom schema

  // Input — use prompt, messages, or both
  prompt: "...",
  messages: [
    { role: "system", content: "..." },
    { role: "user", content: "..." },
  ],
  systemPrompt: "...",         // shorthand for a system message

  // Extraction
  mode: "auto",                // "auto" | "tool-calling" | "json-mode" | "prompt-inject"

  // Retry
  maxRetries: 3,
  retryOptions: {
    strategy: "exponential",   // "immediate" (default) | "linear" | "exponential"
    baseDelayMs: 500,
  },

  // Generation params
  temperature: 0,
  maxTokens: 1000,
  topP: 1,                     // nucleus sampling
  seed: 42,                    // reproducible outputs (where supported)

  // Cancellation
  signal: abortController.signal,

  // Observability
  trackUsage: false,
  hooks: { ... },

  // Fallback
  fallbackChain: [ ... ],
});

generateArray

Extracts a list of items. Pass the schema for a single item, get back an array.

import { generateArray } from "structured-llm";
import { z } from "zod";

const TransactionSchema = z.object({
  date: z.string(),
  merchant: z.string(),
  amount: z.number(),
  category: z.enum(["food", "transport", "shopping", "utilities", "other"]),
});

const { data } = await generateArray({
  client: openai,
  model: "gpt-4o-mini",
  schema: TransactionSchema,     // schema for ONE transaction
  prompt: bankStatementText,
  minItems: 1,                   // hint to the LLM
  maxItems: 100,
});

// data is Transaction[]
const total = data.reduce((sum, t) => sum + t.amount, 0);
console.log(`${data.length} transactions, total: $${total.toFixed(2)}`);

generateStream

Streams the response, yielding partial objects as fields come in. Useful for long outputs or real-time UIs.

import { generateStream } from "structured-llm";
import { z } from "zod";

const ReportSchema = z.object({
  title: z.string(),
  executiveSummary: z.string(),
  sections: z.array(z.object({
    heading: z.string(),
    content: z.string(),
    keyPoints: z.array(z.string()),
  })),
  conclusion: z.string(),
  riskLevel: z.enum(["low", "medium", "high"]),
});

const stream = generateStream({
  client: openai,
  model: "gpt-4o",
  schema: ReportSchema,
  prompt: "Write a comprehensive market analysis for the EV industry in 2025.",
  signal: request.signal,   // cancel when the HTTP request is aborted
});

// Iterate over partial updates
for await (const event of stream) {
  if (event.isDone) {
    console.log("Complete:", event.partial.title);
    console.log("Sections:", event.partial.sections?.length);
  } else {
    // Partial<z.infer<typeof ReportSchema>>: render what you have so far
    process.stdout.write(".");
  }
}

// Or just await the final validated result
const { data } = await stream.result;

Rate-limit and transient server errors (HTTP 429, 502, 503, 529) are retried automatically with exponential backoff, and any partial events already emitted are rolled back before the retry.


generateArrayStream

Stream array items as they complete. Each event contains the cumulative list of fully-parsed items so far.

import { generateArrayStream } from "structured-llm";
import { z } from "zod";

const stream = generateArrayStream({
  client: openai,
  model: "gpt-4o",
  schema: z.object({ name: z.string(), price: z.number(), category: z.string() }),
  prompt: "List 20 top-selling electronics products for 2025",
});

for await (const { items, isDone } of stream) {
  console.log(`${items.length} items loaded...`);
  if (isDone) renderFinalList(items);
}

// Or await the complete result directly
const { data } = await stream.result;

Each event:

interface ArrayStreamEvent<T> {
  items: T[];          // cumulative array of complete, validated items
  isDone: boolean;
  usage?: UsageInfo;   // only on the final event when trackUsage: true
}

generateBatch

Process many inputs against the same schema with controlled concurrency. Handles partial failures, progress callbacks, and aggregated usage stats.

import { generateBatch } from "structured-llm";

const { items, succeeded, failed, totalUsage } = await generateBatch({
  client: openai,
  model: "gpt-4o-mini",
  schema: SentimentSchema,
  inputs: reviews.map((text) => ({ prompt: text })),
  concurrency: 5,          // max parallel API calls (default 3)
  continueOnError: true,   // don't throw on individual failures (default true)
  onProgress: ({ completed, total, succeeded, failed }) => {
    console.log(`${completed}/${total} (${failed} failed)`);
  },
});

console.log(`${succeeded.length}/${items.length} succeeded`);
console.log(`Total cost: $${totalUsage?.estimatedCostUsd?.toFixed(4)}`);

// Results are in original input order
items.forEach(({ index, data, error, durationMs }) => {
  if (error) console.log(`[${index}] failed: ${error.message}`);
  else console.log(`[${index}] ${data.sentiment} (${durationMs}ms)`);
});
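
To make the concurrency and ordering behaviour concrete, here is a minimal sketch of a concurrency-limited map that keeps results in input order and records per-item failures instead of throwing. It is not the library's implementation; `mapWithConcurrency` and `BatchItem` are hypothetical names used only for illustration.

```typescript
// Hypothetical helper, not part of structured-llm: illustrates bounded concurrency.
interface BatchItem<T> {
  index: number;
  data?: T;
  error?: Error;
}

async function mapWithConcurrency<I, T>(
  inputs: I[],
  concurrency: number,
  worker: (input: I, index: number) => Promise<T>,
): Promise<BatchItem<T>[]> {
  const items: BatchItem<T>[] = new Array(inputs.length);
  let next = 0; // shared cursor: each runner claims the next unprocessed index

  async function runner(): Promise<void> {
    while (next < inputs.length) {
      const index = next++;
      try {
        items[index] = { index, data: await worker(inputs[index], index) };
      } catch (err) {
        // Record the failure and keep going, similar in spirit to continueOnError: true
        items[index] = { index, error: err instanceof Error ? err : new Error(String(err)) };
      }
    }
  }

  // Start at most `concurrency` runners; each pulls work until the inputs run out.
  const runnerCount = Math.min(concurrency, inputs.length);
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return items; // slot i always holds the outcome for inputs[i]
}
```

Because each result is written to its own slot, the output order matches the input order regardless of which call finishes first.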

generateMultiSchema

Run the same input through multiple Zod schemas simultaneously. Useful when you need different structured views of the same document.

import { generateMultiSchema } from "structured-llm";

const { results, totalUsage } = await generateMultiSchema({
  client: openai,
  model: "gpt-4o-mini",
  prompt: contractText,
  schemas: {
    keyTerms: KeyTermsSchema,      // parties, dates, governing law
    risks: RiskAssessmentSchema,   // red flags, severity scores
    obligations: ObligationSchema, // what each party must do
  },
  parallel: true,          // run all schemas concurrently (default true)
  continueOnError: true,   // individual schema failures don't abort others
});

console.log(results.keyTerms.data);    // KeyTerms | undefined
console.log(results.risks.data);       // RiskAssessment | undefined
console.log(results.obligations.data); // Obligations | undefined
console.log(results.risks.error);      // Error | undefined

createClient

Pre-configure a client once, call it many times. Useful when you're making lots of calls with the same provider/model/settings.

import { createClient } from "structured-llm";
import OpenAI from "openai";

const llm = createClient({
  client: new OpenAI(),
  model: "gpt-4o-mini",
  defaultOptions: {
    temperature: 0,
    maxRetries: 2,
    trackUsage: true,
    hooks: {
      onSuccess: ({ usage }) => {
        db.insert({ tokens: usage?.totalTokens, cost: usage?.estimatedCostUsd });
      },
    },
  },
});

// All calls inherit the defaults — override per-call as needed
const { data: sentiment } = await llm.generate({
  schema: SentimentSchema,
  prompt: "Analyze this review: ...",
});

const { data: entities } = await llm.generateArray({
  schema: EntitySchema,
  prompt: "Extract all named entities from: ...",
  temperature: 0.2,              // overrides defaultOptions.temperature
});

const stream = llm.generateStream({
  schema: ReportSchema,
  prompt: "Write a report on...",
});

// All helpers are also available on the client
const result = await llm.classify({ ... });
const data = await llm.extract({ ... });
const { results } = await llm.generateMultiSchema({ ... });
const batchResult = await llm.generateBatch({ ... });

High-level helpers

classify

Classify text into one of your categories. No schema boilerplate needed — pass an array of labels and get back a typed result.

import { classify } from "structured-llm";

const { label, confidence, reasoning } = await classify({
  client: openai,
  model: "gpt-4o-mini",
  prompt: "My payment was charged twice last week.",
  options: [
    { value: "billing", description: "Charge, refund, subscription issues" },
    { value: "auth", description: "Login, password, account access" },
    { value: "bug", description: "App not working as expected" },
    { value: "how-to", description: "Questions about how to use the product" },
  ],
  includeConfidence: true,   // 0–1 confidence score
  includeReasoning: true,    // one-sentence explanation
  allowMultiple: false,      // set true for multi-label classification
});

console.log(label);      // "billing"
console.log(confidence); // 0.97
console.log(reasoning);  // "User reports a duplicate charge, a billing issue."

With allowMultiple: true, the response has a labels array:

const { labels } = await classify({
  ...,
  allowMultiple: true,
  prompt: "URGENT: can't log in and my card was charged $500 I didn't authorize",
  options: ["billing", "auth", "urgent", "fraud"],
});
// labels: ["billing", "auth", "urgent", "fraud"]

extract

Extract specific fields from free-form text without writing a full Zod schema. Fields are optional by default — the LLM omits what it can't find.

import { extract } from "structured-llm";

const data = await extract({
  client: openai,
  model: "gpt-4o-mini",
  prompt: invoiceText,
  fields: {
    // shorthand — just a type string
    invoiceNumber: "string",
    totalAmount: "number",
    issueDate: "date",      // "string" | "number" | "boolean" | "date" | "email" | "phone" | "url" | "integer"

    // full FieldDef for more control
    vendorEmail: {
      type: "email",
      description: "Vendor's billing email address",
      required: true,        // validation error if missing
    },
    status: {
      type: "string",
      options: ["draft", "sent", "paid", "overdue"],  // enum
    },
  },
  requireAll: false,     // make all fields required at once
});

console.log(data.invoiceNumber); // "INV-2024-00842"
console.log(data.totalAmount);   // 10476
console.log(data.issueDate);     // "2024-03-05"

createTemplate

Bind a prompt template to a schema and config. Reuse it across your app with different variable substitutions.

import { createTemplate } from "structured-llm";

const analyzeDoc = createTemplate({
  template: "Analyze this {{docType}} from {{company}}:\n\n{{content}}",
  schema: AnalysisSchema,
  client: openai,
  model: "gpt-4o-mini",
  systemPrompt: "You are a business analyst.",
  temperature: 0,
});

// run with variable substitution
const { data } = await analyzeDoc.run({
  docType: "contract",
  company: "Acme Corp",
  content: contractText,
});

// run as array extraction
const { data: items } = await analyzeDoc.runArray({
  docType: "meeting notes",
  company: "TechCo",
  content: notesText,
});

// preview the rendered prompt (no API call)
const prompt = analyzeDoc.render({ docType: "invoice", company: "Acme", content: "..." });

Variables use {{double_braces}} syntax. An error is thrown if a variable is missing at runtime.
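
The substitution rule can be sketched in a few lines. This is an illustrative stand-in, not the library's renderer; `renderTemplate` is a hypothetical name, and the exact error message is an assumption.

```typescript
// Hypothetical renderer, not structured-llm code: substitutes {{name}} placeholders.
function renderTemplate(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_match, name: string) => {
    // Fail loudly when a placeholder has no value, mirroring the documented behaviour
    if (!Object.prototype.hasOwnProperty.call(vars, name)) {
      throw new Error(`Missing template variable: ${name}`);
    }
    return vars[name];
  });
}

const rendered = renderTemplate("Analyze this {{docType}} from {{company}}", {
  docType: "contract",
  company: "Acme Corp",
});
// rendered === "Analyze this contract from Acme Corp"
```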


withCache

Wrap generate() with TTL-based memoization. Identical prompts + model + schema combinations skip the API and return the cached result.

import { withCache } from "structured-llm";

const cachedGenerate = withCache({
  ttl: 5 * 60 * 1000,  // 5 minute TTL (default)
  debug: true,          // log cache hits/misses
  // store: customStore  // optional custom cache backend (e.g. Redis)
  // keyFn: (opts) => myKey  // optional custom cache key function
});

const r1 = await cachedGenerate({ client, model, schema, prompt: "same question" });
const r2 = await cachedGenerate({ client, model, schema, prompt: "same question" });

console.log(r1.fromCache); // false — hit the API
console.log(r2.fromCache); // true — served from cache, no API call

// Use a shared store across multiple withCache instances
import { createCacheStore } from "structured-llm";
const store = createCacheStore();
const cachedA = withCache({ store, ttl: 60_000 });
const cachedB = withCache({ store, ttl: 60_000 });

The cache key includes model, prompt/messages, and schema JSON — different schemas for the same prompt are cached separately.
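
As a rough mental model, TTL memoization keyed on model, prompt, and schema could look like the sketch below. The key derivation, the `memoizeWithTtl` name, and the injectable clock are assumptions made for illustration; the library's own store and key function may differ.

```typescript
// Hypothetical in-memory TTL cache; not the library's implementation.
interface CacheEntry<T> {
  value: T;
  expiresAt: number;
}

// Assumed key shape: model + prompt + schema JSON serialized together.
function cacheKey(model: string, prompt: string, jsonSchema: unknown): string {
  return JSON.stringify([model, prompt, jsonSchema]);
}

function memoizeWithTtl<T>(
  fn: (model: string, prompt: string, jsonSchema: unknown) => Promise<T>,
  ttlMs: number,
  now: () => number = Date.now, // injectable clock so expiry is easy to test
) {
  const store = new Map<string, CacheEntry<T>>();
  return async (model: string, prompt: string, jsonSchema: unknown) => {
    const key = cacheKey(model, prompt, jsonSchema);
    const hit = store.get(key);
    if (hit && hit.expiresAt > now()) {
      return { data: hit.value, fromCache: true }; // fresh entry: skip the call
    }
    const value = await fn(model, prompt, jsonSchema);
    store.set(key, { value, expiresAt: now() + ttlMs });
    return { data: value, fromCache: false };
  };
}
```

Since the schema is part of the key, the same prompt with a different schema produces a separate entry.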


Providers

The library auto-detects the provider from your client instance. Just pass it in.

Native providers

| Provider | Install | Client class |
|---|---|---|
| OpenAI | npm i openai | new OpenAI() |
| Anthropic | npm i @anthropic-ai/sdk | new Anthropic() |
| Gemini | npm i @google/genai | new GoogleGenAI({ apiKey }) |
| Mistral | npm i @mistralai/mistralai | new Mistral({ apiKey }) |
| Cohere | npm i cohere-ai | new CohereClient({ token }) |
| AWS Bedrock | npm i @aws-sdk/client-bedrock-runtime | new BedrockRuntimeClient({ region }) |

OpenAI-compatible providers

These all use the OpenAI SDK pointed at a different endpoint:

import OpenAI from "openai";

// Groq — fastest inference, great for real-time apps
const groq = new OpenAI({ apiKey: process.env.GROQ_API_KEY, baseURL: "https://api.groq.com/openai/v1" });
generate({ client: groq, model: "llama-3.3-70b-versatile", ... })

// xAI (Grok)
const xai = new OpenAI({ apiKey: process.env.XAI_API_KEY, baseURL: "https://api.x.ai/v1" });

// Together AI — large selection of open models
const together = new OpenAI({ apiKey: process.env.TOGETHER_API_KEY, baseURL: "https://api.together.xyz/v1" });
generate({ client: together, model: "meta-llama/Llama-3.3-70B-Instruct-Turbo", ... })

// Fireworks AI, Perplexity — same pattern

// Ollama — local models, completely free
const ollama = new OpenAI({ apiKey: "ollama", baseURL: "http://localhost:11434/v1" });
generate({ client: ollama, model: "llama3.2", mode: "json-mode" })

// Azure OpenAI
const azure = new OpenAI({
  apiKey: process.env.AZURE_OPENAI_API_KEY,
  baseURL: "https://your-resource.openai.azure.com/openai/deployments/gpt-4o",
});

AWS Bedrock

import { BedrockRuntimeClient } from "@aws-sdk/client-bedrock-runtime";
import { generate } from "structured-llm";

const bedrock = new BedrockRuntimeClient({ region: "us-east-1" });

const { data } = await generate({
  client: bedrock,
  model: "anthropic.claude-3-5-sonnet-20241022-v2:0",
  schema: MySchema,
  prompt: "...",
});

Or use the provider string to auto-initialize from environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION):

generate({
  provider: "bedrock",
  model: "amazon.nova-pro-v1:0",
  schema: MySchema,
  prompt: "...",
})

Auto-initialize from environment

Skip client creation entirely — pass a provider string and the library reads from env:

generate({
  provider: "openai",     // reads OPENAI_API_KEY
  model: "...",
  schema: ...,
  prompt: "...",
})

// Other accepted provider values and the env vars they read:
//   "anthropic"   ANTHROPIC_API_KEY
//   "gemini"      GEMINI_API_KEY
//   "mistral"     MISTRAL_API_KEY
//   "groq"        GROQ_API_KEY
//   "xai"         XAI_API_KEY
//   "together"    TOGETHER_API_KEY
//   "fireworks"   FIREWORKS_API_KEY
//   "perplexity"  PERPLEXITY_API_KEY
//   "cohere"      COHERE_API_KEY
//   "bedrock"     AWS_ACCESS_KEY_ID + AWS_SECRET_ACCESS_KEY + AWS_REGION
//   "ollama"      no key needed

Extraction modes

The library automatically picks the best extraction mode based on the model's capabilities. You can also set it explicitly.

| Mode | How it works | Reliability |
|---|---|---|
| tool-calling | Schema becomes a tool definition. LLM is forced to "call" it, guaranteeing JSON. | Highest |
| json-mode | Sets response_format: json_object. Schema embedded in system prompt. | High |
| prompt-inject | Schema appended to user prompt. JSON extracted from response with fallback parsing. | Good |

Auto-selection logic:

Does the model support tool calling?
  YES → tool-calling  (GPT-4o, Claude 3+, Gemini 1.5+, Mistral Large, Groq)
  NO
    Does the model support JSON mode?
      YES → json-mode  (GPT-3.5, Gemini Flash, Perplexity, most modern models)
      NO  → prompt-inject  (works on any model, including Ollama local models)
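
The decision tree above maps directly onto a small function. This is a sketch of the selection rule only; `pickMode` is a hypothetical name, and the `toolCalling` / `jsonMode` flags are borrowed from the shape that getModelCapabilities returns.

```typescript
type ExtractionMode = "tool-calling" | "json-mode" | "prompt-inject";

interface ModeCapabilities {
  toolCalling: boolean;
  jsonMode: boolean;
}

// Hypothetical helper: prefer the most reliable mode the model supports.
function pickMode(caps: ModeCapabilities): ExtractionMode {
  if (caps.toolCalling) return "tool-calling";
  if (caps.jsonMode) return "json-mode";
  return "prompt-inject"; // works on any model
}

const mode = pickMode({ toolCalling: false, jsonMode: true });
// mode === "json-mode"
```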

Override when needed:

generate({ ..., mode: "json-mode" })
generate({ ..., mode: "prompt-inject" })

Retry logic

On invalid JSON or schema validation failure, the library retries automatically. Each retry includes the validation errors so the LLM can fix its own output.

Attempt 1:  LLM returns { "score": 1.8, "sentiment": "mixed" }
            → validation fails: score must be ≤ 1, sentiment must be "positive"|"negative"|"neutral"

Attempt 2:  "Your previous response had errors:
             - score: Number must be less than or equal to 1
             - sentiment: Invalid enum value
            Please fix and respond with corrected JSON."
            → LLM returns { "score": 0.8, "sentiment": "positive" }  ✓
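
The exchange traced above can be sketched as a loop that appends the validation errors to the conversation before asking again. This is an illustration under assumptions, not the library's code: `generateWithFeedback`, `Message`, and `Validation` are hypothetical names, and the model call is injected so the sketch runs without any provider.

```typescript
// Hypothetical sketch of retry-with-feedback; names are assumptions.
type Message = { role: "user" | "assistant"; content: string };
type Validation<T> = { ok: true; data: T } | { ok: false; issues: string[] };

async function generateWithFeedback<T>(
  callModel: (messages: Message[]) => Promise<string>,
  validate: (value: unknown) => Validation<T>,
  prompt: string,
  maxRetries = 3,
): Promise<{ data: T; attempts: number }> {
  const messages: Message[] = [{ role: "user", content: prompt }];
  let issues: string[] = [];

  for (let attempt = 1; attempt <= maxRetries + 1; attempt++) {
    const raw = await callModel(messages);

    // Start from the parse-failure result; overwrite it if parsing succeeds.
    let result: Validation<T> = { ok: false, issues: ["Response was not valid JSON"] };
    try {
      result = validate(JSON.parse(raw));
    } catch {
      // JSON.parse (or the validator) threw: keep the failure result
    }
    if (result.ok) return { data: result.data, attempts: attempt };

    // Feed the errors back so the next attempt can correct them.
    issues = result.issues;
    const bulletList = issues.map((issue) => `- ${issue}`).join("\n");
    messages.push({ role: "assistant", content: raw });
    messages.push({
      role: "user",
      content: `Your previous response had errors:\n${bulletList}\nPlease fix and respond with corrected JSON.`,
    });
  }
  throw new Error(`Validation failed after ${maxRetries + 1} attempts: ${issues.join("; ")}`);
}
```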

Rate-limit and transient server errors (HTTP 429, 502, 503, 529) are also retried automatically with exponential backoff; no configuration is needed.

generate({
  ...,
  maxRetries: 3,          // default: 3 (set 0 to disable)
  retryOptions: {
    strategy: "exponential",   // "immediate" (default) | "linear" | "exponential"
    baseDelayMs: 500,          // base delay for linear/exponential strategies
  },
})
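
For intuition about how the three strategies space out attempts, here is a small sketch. The formulas are assumptions (linear adds one base delay per attempt, exponential doubles each time); the library's actual schedule, including any jitter or caps, may differ. `retryDelayMs` is a hypothetical name.

```typescript
type RetryStrategy = "immediate" | "linear" | "exponential";

// Assumed formulas for illustration only; attempt is 1-based.
function retryDelayMs(strategy: RetryStrategy, attempt: number, baseDelayMs = 500): number {
  switch (strategy) {
    case "immediate":
      return 0;
    case "linear":
      return baseDelayMs * attempt;
    case "exponential":
      return baseDelayMs * 2 ** (attempt - 1);
  }
}

const exponentialDelays = [1, 2, 3].map((attempt) => retryDelayMs("exponential", attempt));
// exponentialDelays: [500, 1000, 2000]
```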

Fallback chain

Define a list of provider+model pairs to try in order. Falls back automatically if the primary provider fails (network error, rate limit, outage, etc.).

import OpenAI from "openai";
import Anthropic from "@anthropic-ai/sdk";

generate({
  // primary
  client: new OpenAI(),
  model: "gpt-4o",

  fallbackChain: [
    // first fallback — cheaper model, same provider
    { client: new OpenAI(), model: "gpt-4o-mini" },

    // second fallback — different provider
    { client: new Anthropic(), model: "claude-haiku-4-5-20251001" },

    // last resort — free local model
    { provider: "ollama", model: "llama3.2" },
  ],

  schema: ...,
  prompt: "...",
  hooks: {
    onError: ({ error }) => console.log("Primary failed, trying fallback:", error.message),
  },
})
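
Conceptually, a fallback chain is an ordered loop that returns the first success and reports each failure along the way. The sketch below shows that control flow with injected functions instead of real providers; `runWithFallback` and `Target` are hypothetical names, not library exports.

```typescript
// Hypothetical sketch: try each target in order, return the first success.
interface Target<T> {
  name: string;
  run: () => Promise<T>;
}

async function runWithFallback<T>(
  targets: Target<T>[],
  onError?: (info: { target: string; error: Error }) => void,
): Promise<{ result: T; usedTarget: string }> {
  let lastError: Error = new Error("No targets configured");
  for (const target of targets) {
    try {
      return { result: await target.run(), usedTarget: target.name };
    } catch (err) {
      lastError = err instanceof Error ? err : new Error(String(err));
      onError?.({ target: target.name, error: lastError }); // observe, then move on
    }
  }
  throw lastError; // every target failed
}
```

Listing the primary first and the cheapest or most available option last keeps the common path fast while still degrading gracefully.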

Usage tracking

Pass trackUsage: true to get token counts and a cost estimate back with every call.

const { data, usage } = await generate({
  ...,
  trackUsage: true,
});

console.log(usage);
// {
//   promptTokens: 312,
//   completionTokens: 95,
//   totalTokens: 407,
//   estimatedCostUsd: 0.0000891,    // based on published pricing
//   latencyMs: 843,
//   attempts: 1,
//   model: "gpt-4o-mini",
//   provider: "openai",
// }

The cost estimate uses a built-in pricing table updated with each release. For unknown models, estimatedCostUsd is undefined.

Use the onSuccess hook to pipe usage data to your analytics or database:

createClient({
  ...,
  defaultOptions: {
    trackUsage: true,
    hooks: {
      onSuccess: ({ usage }) => {
        myAnalytics.record({
          model: usage?.model,
          tokens: usage?.totalTokens,
          cost: usage?.estimatedCostUsd,
          latency: usage?.latencyMs,
        });
      },
    },
  },
})

Hooks

Hooks run at each stage of the request lifecycle. Useful for logging, metrics, cost tracking, and debugging.

generate({
  ...,
  hooks: {
    // Fires before each LLM request (including retries)
    onRequest: ({ messages, model, provider, attempt }) => {
      logger.debug("LLM request", { model, provider, attempt, messageCount: messages.length });
    },

    // Fires when the LLM responds (before parsing/validation)
    onResponse: ({ rawResponse, attempt, model }) => {
      // useful for debugging what the LLM actually returned
    },

    // Fires on each partial update during generateStream() / generateArrayStream()
    onChunk: ({ partial, model }) => {
      broadcastToWebSocket(partial);
    },

    // Fires when a retry is about to happen
    onRetry: ({ attempt, maxRetries, error, model }) => {
      logger.warn(`Retrying (${attempt}/${maxRetries}): ${error}`);
    },

    // Fires when the final result passes validation
    onSuccess: ({ result, usage }) => {
      metrics.increment("llm.success", { model: usage?.model });
    },

    // Fires when all attempts fail
    onError: ({ error, allAttempts }) => {
      logger.error("LLM extraction failed", { error: error.message, attempts: allAttempts });
      alerting.send("LLM failure", error);
    },
  },
})

When using createClient, global hooks (set on the client) and per-call hooks both fire — you don't have to choose.

const llm = createClient({
  ...,
  defaultOptions: {
    hooks: { onSuccess: globalMetrics },   // always runs
  },
});

llm.generate({
  ...,
  hooks: { onSuccess: localLog },          // also runs, in addition to globalMetrics
});
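
One way to picture "both fire" is a merge step that wraps the two callbacks into one. This sketch is not taken from the library; `mergeHooks` is a hypothetical name, and calling the global hook before the per-call hook is an assumption about ordering.

```typescript
type Hook<E> = (event: E) => void;

// Hypothetical merge: run the client-level hook, then the per-call hook.
function mergeHooks<E>(globalHook?: Hook<E>, localHook?: Hook<E>): Hook<E> {
  return (event) => {
    globalHook?.(event);
    localHook?.(event);
  };
}

const calls: string[] = [];
const onSuccess = mergeHooks<{ model: string }>(
  (e) => calls.push(`global:${e.model}`),
  (e) => calls.push(`local:${e.model}`),
);
onSuccess({ model: "gpt-4o-mini" });
// calls: ["global:gpt-4o-mini", "local:gpt-4o-mini"]
```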

Error handling

All errors extend StructuredLLMError so you can catch them broadly or specifically.

import {
  StructuredLLMError,   // base class
  ValidationError,      // schema validation failed after all retries
  ParseError,           // LLM returned non-JSON after all retries
  ProviderError,        // upstream API error (rate limit, auth, network)
  MaxRetriesError,      // exceeded maxRetries (shouldn't normally see this)
  SchemaError,          // invalid schema passed in
  MissingInputError,    // no prompt or messages provided
} from "structured-llm";

try {
  const { data } = await generate({ ... });
} catch (err) {
  if (err instanceof ValidationError) {
    // The LLM consistently returned data that didn't match your schema
    console.log(err.issues);        // array of validation error strings
    console.log(err.lastResponse);  // the raw JSON string the LLM returned
    console.log(err.attempts);      // how many times it tried (maxRetries + 1)
  }

  if (err instanceof ParseError) {
    // The LLM kept returning non-JSON (rare with tool-calling mode)
    console.log(err.lastResponse);
  }

  if (err instanceof ProviderError) {
    // The provider API returned an error
    console.log(err.provider);      // "openai"
    console.log(err.statusCode);    // 429 (rate limit), 401 (auth), etc.
    console.log(err.originalError); // the raw error from the SDK
  }

  if (err instanceof StructuredLLMError) {
    // catch-all for any structured-llm error
  }
}

Custom schemas

You don't have to use Zod. Any object with a jsonSchema and parse function works:

// Hand-rolled validator
const { data } = await generate({
  ...,
  schema: {
    jsonSchema: {
      type: "object",
      properties: {
        score: { type: "number", minimum: 0, maximum: 1 },
        label: { type: "string", enum: ["spam", "ham"] },
      },
      required: ["score", "label"],
    },
    parse: (input) => {
      const d = input as { score: number; label: string };
      if (d.score < 0 || d.score > 1) throw new Error("score out of range");
      if (!["spam", "ham"].includes(d.label)) throw new Error("invalid label");
      return d;
    },
  },
});

// With TypeBox
import { Type, type Static } from "@sinclair/typebox";
import { TypeCompiler } from "@sinclair/typebox/compiler";

const UserSchema = Type.Object({
  name: Type.String(),
  age: Type.Number({ minimum: 0 }),
  role: Type.Union([Type.Literal("admin"), Type.Literal("user")]),
});
type User = Static<typeof UserSchema>;
const compiled = TypeCompiler.Compile(UserSchema);

const { data } = await generate({
  client: openai,
  model: "gpt-4o-mini",
  schema: {
    jsonSchema: UserSchema,
    parse: (input: unknown): User => {
      const errors = [...compiled.Errors(input)];
      if (errors.length) throw new Error(errors.map((e) => e.message).join(", "));
      return input as User;
    },
  },
  prompt: "...",
});

Standard Schema (Valibot, ArkType)

Libraries that implement the Standard Schema v1 spec are auto-detected and work without any adapters. This includes Valibot, ArkType, Effect Schema, and Zod v4.

import * as v from "valibot";
import { generate } from "structured-llm";

const PersonSchema = v.object({
  name: v.string(),
  age: v.number(),
  email: v.pipe(v.string(), v.email()),
});

// Just pass it — no adapter needed
const { data } = await generate({
  client: openai,
  model: "gpt-4o-mini",
  schema: PersonSchema,
  prompt: "Extract: Alice Smith, 28, [email protected]",
});
// data.name, data.age, data.email are fully typed

import { type } from "arktype";

const UserType = type({ name: "string", score: "number" });

const { data } = await generate({
  client: openai,
  model: "gpt-4o-mini",
  schema: UserType,
  prompt: "...",
});

If you need to convert explicitly, use fromStandardSchema:

import { fromStandardSchema } from "structured-llm";
const schema = fromStandardSchema(valibotSchema);

Framework integrations

Next.js App Router

// app/api/analyze/route.ts — simple JSON endpoint
import { createStructuredRoute } from "structured-llm/next";
import { z } from "zod";

export const POST = createStructuredRoute({
  provider: "openai",
  model: "gpt-4o-mini",
  schema: z.object({
    category: z.enum(["bug", "feature", "question"]),
    priority: z.enum(["low", "medium", "high"]),
    summary: z.string(),
  }),
});
// Request:  POST /api/analyze  { "prompt": "App crashes on login" }
// Response: { "data": { "category": "bug", "priority": "high", "summary": "..." } }

// app/api/stream/route.ts — NDJSON streaming endpoint
import { createStreamingRoute } from "structured-llm/next";

export const POST = createStreamingRoute({
  provider: "openai",
  model: "gpt-4o",
  schema: ReportSchema,
});
// Streams: {"partial":{...},"isDone":false}\n{"partial":{...},"isDone":true,...}\n

// As a server action
import { withStructured } from "structured-llm/next";

export const classifyTicket = withStructured({
  provider: "openai",
  model: "gpt-4o-mini",
  schema: TicketSchema,
});

const result = await classifyTicket({ prompt: ticket.description });
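
On the consuming side, the NDJSON output of the streaming route has to be split on newlines, and a network chunk can end in the middle of a line. The sketch below shows one way to buffer for that. It is a hypothetical client helper, not something structured-llm ships; `createNdjsonParser` and `StreamEvent` are illustrative names, with the event shape taken from the stream format shown in the route comment.

```typescript
interface StreamEvent<T> {
  partial: Partial<T>;
  isDone: boolean;
}

// Hypothetical client-side helper: feed it text chunks, get back complete events.
function createNdjsonParser<T>() {
  let buffer = "";
  return (chunk: string): StreamEvent<T>[] => {
    buffer += chunk;
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? ""; // keep the trailing incomplete line for the next chunk
    return lines
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line) as StreamEvent<T>);
  };
}

const parse = createNdjsonParser<{ title: string }>();
const first = parse('{"partial":{"title":"EV"},"isDo'); // incomplete line: nothing yet
const second = parse('ne":false}\n{"partial":{"title":"EV market"},"isDone":true}\n');
// first.length === 0, second.length === 2
```

In a browser you would typically feed this parser the decoded text of each chunk read from the response body.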

Hono

import { Hono } from "hono";
import { structuredLLM, createStructuredHandler, createStreamingHandler } from "structured-llm/hono";

const app = new Hono();

// Middleware — attaches result to context, calls next()
app.post(
  "/extract",
  structuredLLM({
    provider: "openai",
    model: "gpt-4o-mini",
    schema: ContactSchema,
    promptFromBody: (body) => `Extract contact info from: ${body.text}`,
  }),
  (c) => c.json(c.get("structuredResult"))
);

// Route handler — responds directly
app.post("/analyze", createStructuredHandler({ provider: "openai", model: "gpt-4o-mini", schema: AnalysisSchema }));

// Streaming route handler
app.post("/stream", createStreamingHandler({ provider: "openai", model: "gpt-4o", schema: ReportSchema }));

Express

import express from "express";
import { structuredMiddleware, createStructuredHandler, createStreamingHandler } from "structured-llm/express";

const app = express();
app.use(express.json());

// Middleware — attaches to req.structured, calls next()
app.post(
  "/classify",
  structuredMiddleware({
    provider: "openai",
    model: "gpt-4o-mini",
    schema: IntentSchema,
    promptFromBody: (body) => body.message,
  }),
  (req, res) => res.json(req.structured)
);

// Route handler — responds directly
app.post("/analyze", createStructuredHandler({ provider: "openai", model: "gpt-4o-mini", schema: AnalysisSchema }));

// Streaming route handler (NDJSON)
app.post("/stream", createStreamingHandler({ provider: "openai", model: "gpt-4o", schema: ReportSchema }));

Model utilities

import { getModelCapabilities, listSupportedModels } from "structured-llm";

// check a specific model
const caps = getModelCapabilities("gpt-4o-mini");
// {
//   provider: "openai",
//   toolCalling: true,
//   jsonMode: true,
//   streaming: true,
//   contextWindow: 128000,
//   inputCostPer1M: 0.15,
//   outputCostPer1M: 0.6
// }

// list all supported models for a provider
listSupportedModels({ provider: "anthropic" });
// ["claude-opus-4-6", "claude-sonnet-4-6", "claude-3-7-sonnet-20250219", "claude-haiku-4-5-20251001", ...]

// list everything
listSupportedModels();
// all 40+ registered models
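
The per-million-token prices returned above suggest how a cost estimate could be derived. The sketch below assumes a plain linear formula; the library's own estimate may apply different rates or rules, so treat this as an approximation. `estimateCostUsd` and `Pricing` are hypothetical names.

```typescript
interface Pricing {
  inputCostPer1M: number;  // USD per 1M prompt tokens
  outputCostPer1M: number; // USD per 1M completion tokens
}

// Assumed formula: tokens times the per-million rate, summed for input and output.
function estimateCostUsd(promptTokens: number, completionTokens: number, pricing: Pricing): number {
  const totalPerMillion =
    promptTokens * pricing.inputCostPer1M + completionTokens * pricing.outputCostPer1M;
  return totalPerMillion / 1_000_000;
}

// Using the gpt-4o-mini prices shown in the capabilities output above
const cost = estimateCostUsd(1_000_000, 1_000_000, { inputCostPer1M: 0.15, outputCostPer1M: 0.6 });
// cost is approximately 0.75
```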

Newly added models in v0.2.0: gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, o3, o4-mini, claude-3-7-sonnet-20250219, claude-haiku-4-5-20251001, gemini-2.5-pro, gemini-2.5-flash, llama-4-scout-17b-16e-instruct, llama-4-maverick-17b-128e-instruct


Examples

The examples/ directory has 40 full runnable examples:

git clone https://github.com/piyushgupta344/structured-llm
cd structured-llm && pnpm install

OPENAI_API_KEY=sk-... npx tsx examples/01-sentiment-analysis.ts

Core features:

| Example | What it demonstrates |
|---|---|
| 01-sentiment-analysis.ts | Batch sentiment scoring with confidence intervals |
| 02-data-extraction.ts | Parse meeting notes into structured agenda / action items |
| 03-multi-provider.ts | Run the same extraction across OpenAI, Anthropic, Gemini |
| 04-fallback-chain.ts | Automatic fallback when primary provider is unavailable |
| 05-streaming.ts | Real-time partial updates while generating a long report |
| 06-generate-array.ts | Parse a bank statement into typed transaction objects |
| 07-create-client.ts | Reusable client for an email triage pipeline |
| 08-custom-schema.ts | Bring your own validator instead of Zod |
| 09-fintech-analysis.ts | Parse earnings call transcripts and classify headlines |
| 10-ollama-local.ts | Run everything locally with Ollama — zero API cost |

Document processing:

| Example | What it demonstrates |
|---|---|
| 11-resume-parsing.ts | Extract skills, experience, and education from a CV |
| 12-invoice-extraction.ts | extract() helper — parse billing data from invoice text |
| 15-legal-contract-analysis.ts | generateMultiSchema() — key terms + risk assessment from one document |
| 18-medical-notes-extraction.ts | Extract vitals, symptoms, medications from clinical notes |
| 28-academic-paper-analysis.ts | Metadata + contributions from research papers |
| 39-multi-schema-document.ts | generateMultiSchema() — summary + quotes + actions from one document |

Classification & routing:

| Example | What it demonstrates |
|---|---|
| 13-content-moderation.ts | Multi-category content safety scoring |
| 14-support-ticket-routing.ts | classify() — route tickets to the right team with confidence |
| 34-multilingual-feedback.ts | generateBatch() — detect language, translate, and classify in bulk |
| 38-bug-triage.ts | generateBatch() — severity, priority, and owner assignment |

Data pipelines:

| Example | What it demonstrates |
|---|---|
| 17-product-catalog-normalization.ts | generateBatch() — normalize messy product data |
| 29-real-estate-listing.ts | generateArray() — parse multiple property listings at once |
| 33-competitor-analysis.ts | generateBatch() — competitive intelligence at scale |
| 37-caching-repeated-queries.ts | withCache() — avoid redundant API calls for identical inputs |
| 40-market-research-template.ts | createTemplate() — run the same research framework across markets |


Contributing

Contributions are welcome. See CONTRIBUTING.md for how to get started, what kind of PRs are accepted, and how to add a new provider.


License

MIT