
@ordis-dev/ordis

v0.6.1


Schema-first LLM extraction tool that turns unstructured text into validated structured data


Ordis

Ordis is a local-first tool and library that turns messy, unstructured text into clean, structured data using a schema-driven extraction pipeline powered by LLMs. You give it a schema that describes the fields you expect, point it at some raw text, and choose any OpenAI-compatible model. Ordis builds the prompt, calls the model, validates the output, and returns either a correct structured record or a clear error.

Ordis does for LLM extraction what Prisma does for databases: strict schemas, predictable output and no more glue code.

Status

✅ CLI functional - the core extraction pipeline works with real LLMs and is ready for testing and feedback.

✅ Programmatic API - Can be used as an npm package in Node.js applications.

Features

  • Local-first extraction: Supports Ollama, LM Studio, or any OpenAI-compatible endpoint
  • Schema-first workflow: Define your data structure upfront
  • Deterministic output: Returns validated records or structured failures
  • Token budget awareness: Automatic token counting with warnings and limits
  • HTML preprocessing: Strip noise from web pages before extraction
  • Dual-purpose: Use as a CLI or import as a library
  • TypeScript support: Full type definitions included

Example

ordis extract \
  --schema examples/invoice.schema.json \
  --input examples/invoice.txt \
  --base http://localhost:11434/v1 \
  --model llama3.1:8b \
  --debug

Sample schema (invoice.schema.json):

{
  "fields": {
    "invoice_id": { "type": "string" },
    "amount": { "type": "number" },
    "currency": { "type": "string", "enum": ["USD", "SGD", "EUR"] },
    "date": { "type": "string", "format": "date-time", "optional": true }
  }
}
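To make the schema semantics concrete, here is a minimal sketch of how such a schema could be checked against an extracted record. This is an illustration only, not Ordis's actual validator (which also handles coercion, nested objects, and confidence scoring):

```javascript
// Illustrative only: how "type", "enum", and "optional" are interpreted.
function validateRecord(schema, record) {
  const errors = [];
  for (const [name, field] of Object.entries(schema.fields)) {
    const value = record[name];
    if (value === undefined || value === null) {
      // Missing values are only an error for required fields
      if (!field.optional) errors.push(`${name}: required field is missing`);
      continue;
    }
    if (typeof value !== field.type) {
      errors.push(`${name}: expected ${field.type}, got ${typeof value}`);
      continue;
    }
    if (field.enum && !field.enum.includes(value)) {
      errors.push(`${name}: "${value}" not in [${field.enum.join(', ')}]`);
    }
  }
  return { success: errors.length === 0, errors };
}

const schema = {
  fields: {
    invoice_id: { type: 'string' },
    amount: { type: 'number' },
    currency: { type: 'string', enum: ['USD', 'SGD', 'EUR'] },
    date: { type: 'string', optional: true }
  }
};

const ok = validateRecord(schema, { invoice_id: 'INV-1', amount: 42, currency: 'USD' });
const bad = validateRecord(schema, { amount: '42', currency: 'GBP' });
// ok.success === true; bad collects three errors (missing id, wrong type, bad enum)
```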

Model Compatibility

Works with any service exposing an OpenAI-compatible API:

  • Ollama
  • LM Studio
  • OpenRouter
  • Mistral
  • Groq
  • OpenAI
  • vLLM servers

Installation

From npm (recommended)

Install globally to use the CLI anywhere:

npm install -g @ordis-dev/ordis
ordis --help

Or install locally in your project:

npm install @ordis-dev/ordis

From Source

git clone https://github.com/ordis-dev/ordis
cd ordis
npm install
npm run build
node dist/cli.js --help

Usage

CLI Usage

Extract data from text using a schema:

ordis extract \
  --schema examples/invoice.schema.json \
  --input examples/invoice.txt \
  --base http://localhost:11434/v1 \
  --model llama3.1:8b \
  --debug

With API key (for providers like OpenAI, Deepseek, etc.):

ordis extract \
  --schema examples/invoice.schema.json \
  --input examples/invoice.txt \
  --base https://api.deepseek.com/v1 \
  --model deepseek-chat \
  --api-key your-api-key-here

Enable JSON mode (for reliable JSON responses):

# OpenAI and compatible providers
ordis extract \
  --schema examples/invoice.schema.json \
  --input examples/invoice.txt \
  --base https://api.openai.com/v1 \
  --model gpt-4o-mini \
  --api-key your-api-key \
  --json-mode

# Ollama (recommended: use /v1 endpoint for portability)
ordis extract \
  --schema examples/invoice.schema.json \
  --input examples/invoice.txt \
  --base http://localhost:11434/v1 \
  --model qwen2.5:32b \
  --json-mode

💡 Note: For Ollama, use http://localhost:11434/v1 for maximum portability across providers. Both /v1 (OpenAI-compatible) and /api (native) endpoints work correctly with JSON mode.
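The difference between the two endpoint styles comes down to which JSON-mode parameter the request carries. A sketch of that mapping, based on the behavior described in this README (response_format for /v1-style endpoints, format: "json" for Ollama's native /api); the helper name is hypothetical:

```javascript
// Hypothetical helper illustrating which JSON-mode parameter each
// endpoint style expects, per the endpoint detection this README describes.
function jsonModeParams(baseURL) {
  if (baseURL.endsWith('/v1')) {
    // OpenAI-compatible endpoints (including Ollama's /v1)
    return { response_format: { type: 'json_object' } };
  }
  // Ollama's native /api endpoint
  return { format: 'json' };
}

const v1 = jsonModeParams('http://localhost:11434/v1');
const native = jsonModeParams('http://localhost:11434/api');
```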

Programmatic Usage

Use ordis as a library in your Node.js application:

import { extract, loadSchema, LLMClient } from '@ordis-dev/ordis';

// Load schema from file
const schema = await loadSchema('./invoice.schema.json');

// Or create a schema from a plain object
import { loadSchemaFromObject } from '@ordis-dev/ordis';
const inlineSchema = loadSchemaFromObject({
  fields: {
    invoice_id: { type: 'string' },
    amount: { type: 'number' },
    currency: { type: 'string', enum: ['USD', 'EUR', 'SGD'] }
  }
});

// Configure LLM
const llmConfig = {
  baseURL: 'http://localhost:11434/v1',
  model: 'llama3.2:3b'
};

// Extract data
const result = await extract({
  input: 'Invoice #INV-2024-0042 for $1,250.00 USD',
  schema,
  llmConfig
});

if (result.success) {
  console.log(result.data);
  // { invoice_id: 'INV-2024-0042', amount: 1250, currency: 'USD' }
  console.log('Confidence:', result.confidence);
} else {
  console.error('Extraction failed:', result.errors);
}

Using LLM Presets:

import { extract, loadSchema, LLMPresets } from '@ordis-dev/ordis';

const schema = await loadSchema('./schema.json');

// Use preset configurations
const result = await extract({
  input: text,
  schema,
  llmConfig: LLMPresets.ollama('llama3.2:3b')
  // Or: LLMPresets.openai(apiKey, 'gpt-4o-mini')
  // Or: LLMPresets.lmStudio('local-model')
});

// Enable JSON mode (provider auto-detected from baseURL)
const resultWithJsonMode = await extract({
  input: text,
  schema,
  llmConfig: {
    baseURL: 'http://localhost:11434/v1',
    model: 'qwen2.5:32b',
    jsonMode: true  // Auto-detects Ollama, uses format: "json"
  }
});

// Explicit provider override
const resultExplicit = await extract({
  input: text,
  schema,
  llmConfig: {
    baseURL: 'https://api.openai.com/v1',
    model: 'gpt-4o-mini',
    apiKey: process.env.OPENAI_API_KEY,
    jsonMode: true,
    provider: 'openai'  // Uses response_format: { type: "json_object" }
  }
});

Extracting from HTML:

import { extract, loadSchema } from '@ordis-dev/ordis';

const schema = await loadSchema('./schema.json');

// Strip HTML noise before extraction
const result = await extract({
  input: rawHtmlContent,
  schema,
  llmConfig: { baseURL: 'http://localhost:11434/v1', model: 'llama3.2:3b' },
  preprocessing: {
    stripHtml: true  // Removes scripts, styles, nav, ads, etc.
    // Or with options:
    // stripHtml: {
    //   preserveStructure: true,  // Convert headings/lists to markdown
    //   removeSelectors: ['.sidebar', '#comments'],
    //   maxLength: 10000
    // }
  }
});
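For intuition, here is a rough, regex-based sketch of what HTML stripping conceptually does. The real stripHtml option is more robust than this (structure preservation, CSS selector removal, length caps):

```javascript
// Rough sketch only: drop script/style blocks, strip tags, collapse
// whitespace, cap length. Not Ordis's actual implementation.
function stripHtmlSketch(html, maxLength = 10000) {
  const text = html
    .replace(/<(script|style)[\s\S]*?<\/\1>/gi, ' ') // remove script/style blocks
    .replace(/<[^>]+>/g, ' ')                        // remove remaining tags
    .replace(/\s+/g, ' ')                            // collapse whitespace
    .trim();
  return text.slice(0, maxLength);
}

const cleaned = stripHtmlSketch(
  '<html><head><style>p{color:red}</style></head>' +
  '<body><p>Invoice <b>INV-7</b> total $99</p><script>track()</script></body></html>'
);
// cleaned === 'Invoice INV-7 total $99'
```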

What Works

  • ✅ Schema loader and validator
  • ✅ Prompt builder with confidence scoring
  • ✅ Universal LLM client (OpenAI-compatible APIs)
  • ✅ Token budget awareness with warnings and errors
  • ✅ Structured error system
  • ✅ CLI extraction command
  • ✅ Programmatic API for library usage
  • ✅ Field-level confidence tracking
  • ✅ TypeScript type definitions
  • ✅ Performance benchmarks
  • ✅ HTML preprocessing for noisy web content

Performance

Pipeline overhead is negligible (~1-2ms). LLM calls dominate execution time (1-10s depending on model). See benchmarks/README.md for detailed metrics.

Run benchmarks:

npm run benchmark

Roadmap

Completed in v0.6.1:

  • ✅ Fixed JSON mode with Ollama /v1 endpoint (#81)
    • Automatic endpoint detection (response_format for /v1, format for /api)
    • Improved documentation with endpoint comparison and recommendations

Completed in v0.6.0:

  • ✅ JSON mode support for OpenAI and Ollama providers (#78)
    • Auto-detection based on base URL
    • Eliminates parsing failures from non-JSON responses
    • Works with both Ollama (format: "json") and OpenAI (response_format)

Completed in v0.5.1:

  • ✅ Default context window increased to 32k (was 4096)
  • ✅ Markdown-wrapped JSON parsing (#74)
  • ✅ AMD GPU support in benchmarks (rocm-smi detection)
  • ✅ GPU health monitoring in benchmarks (VRAM pressure, utilization)

Completed in v0.5.0:

  • ✅ Type coercion for LLM output (#71)
    • Automatic string-to-number/boolean coercion
    • Null-like string handling ("null"/"none"/"n/a")
    • Enum case-insensitive matching ("Series B" → "series_b")
    • Date format normalization (US, EU, written formats)
  • ✅ Array of objects support (#70)
    • Nested object schemas with recursive validation
    • Proper error paths (e.g., "items[1].price")
  • ✅ Ollama runtime options (num_ctx, num_gpu)
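For illustration, the coercion rules above could be sketched roughly like this (not the library's implementation, and omitting date normalization):

```javascript
// Illustrative sketch of string-to-number/boolean coercion, null-like
// string handling, and case-insensitive enum matching.
function coerce(value, field) {
  if (typeof value !== 'string') return value;
  const s = value.trim();
  // Null-like strings become null
  if (['null', 'none', 'n/a'].includes(s.toLowerCase())) return null;
  if (field.type === 'number') {
    const n = Number(s.replace(/,/g, ''));
    if (!Number.isNaN(n)) return n;
  }
  if (field.type === 'boolean' && /^(true|false)$/i.test(s)) {
    return s.toLowerCase() === 'true';
  }
  if (field.enum) {
    // Case-insensitive match with spaces normalized to underscores
    const norm = s.toLowerCase().replace(/\s+/g, '_');
    const hit = field.enum.find(e => e.toLowerCase() === norm);
    if (hit !== undefined) return hit;
  }
  return value;
}

const amount = coerce('1,250.00', { type: 'number' });  // → 1250
const missing = coerce('N/A', { type: 'string' });      // → null
const stage = coerce('Series B', { type: 'string', enum: ['series_a', 'series_b'] });  // → 'series_b'
```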

Completed in v0.4.0:

  • ✅ User-friendly error messages (#63)
    • Emoji indicators (❌, 💡, ℹ️) for quick scanning
    • Expected vs. actual values for validation errors
    • Actionable suggestions for common issues
    • Service-specific troubleshooting (Ollama, LM Studio, OpenAI)
  • ✅ Debug mode enhancements
    • Full LLM request/response logging
    • Token usage breakdown

See CHANGELOG.md for complete version history.

Contributing

Contributions are welcome!