
@ordis-dev/ordis v0.1.1

Schema-first LLM extraction tool that turns unstructured text into validated structured data

Downloads: 191

Ordis

Ordis is a local-first tool and library that turns messy, unstructured text into clean, structured data using a schema-driven extraction pipeline powered by LLMs. You give it a schema that describes the fields you expect, point it at some raw text, and choose any OpenAI-compatible model. Ordis builds the prompt, calls the model, validates the output, and returns either a correct structured record or a clear error.

Ordis does for LLM extraction what Prisma does for databases: strict schemas, predictable output, and no more glue code.

Status

✅ CLI functional - Core extraction pipeline working with real LLMs. Ready for testing and feedback.

✅ Programmatic API - Can be used as an npm package in Node.js applications.

Features

  • Local-first extraction: Supports Ollama, LM Studio, or any OpenAI-compatible endpoint
  • Schema-first workflow: Define your data structure upfront
  • Deterministic output: Returns validated records or structured failures
  • Token budget awareness: Automatic token counting with warnings and limits
  • Dual-purpose: Use as a CLI or import as a library
  • TypeScript support: Full type definitions included
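The token budget awareness listed above can be approximated with a simple character-based heuristic. The sketch below is illustrative only; `estimateTokens` and `checkBudget` are invented names for this example, not Ordis's actual API (real tokenizers such as tiktoken are more accurate than this rule of thumb):

```typescript
// Rough token estimate: ~4 characters per token is a common heuristic
// for English text.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Warn when an input approaches a model's context limit; error when it exceeds it.
function checkBudget(
  input: string,
  contextLimit: number,
  warnRatio = 0.8
): 'ok' | 'warn' | 'error' {
  const tokens = estimateTokens(input);
  if (tokens > contextLimit) return 'error';
  if (tokens > contextLimit * warnRatio) return 'warn';
  return 'ok';
}
```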

Example

ordis extract \
  --schema examples/invoice.schema.json \
  --input examples/invoice.txt \
  --base http://localhost:11434/v1 \
  --model llama3.1:8b \
  --debug

Sample schema (invoice.schema.json):

{
  "fields": {
    "invoice_id": { "type": "string" },
    "amount": { "type": "number" },
    "currency": { "type": "string", "enum": ["USD", "SGD", "EUR"] },
    "date": { "type": "string", "format": "date-time", "optional": true }
  }
}
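To illustrate what validating against a schema like this involves, here is a minimal, hypothetical validator for the schema shape shown above. It is a sketch of the general technique (presence, type, and enum checks), not Ordis's actual implementation:

```typescript
// Hypothetical TypeScript model of the schema file shape shown above.
interface FieldSpec {
  type: 'string' | 'number';
  enum?: string[];
  format?: string;
  optional?: boolean;
}
interface Schema {
  fields: Record<string, FieldSpec>;
}

// Validate a candidate record against the schema: check presence,
// primitive type, and enum membership. Returns a list of error messages.
function validateRecord(schema: Schema, record: Record<string, unknown>): string[] {
  const errors: string[] = [];
  for (const [name, spec] of Object.entries(schema.fields)) {
    const value = record[name];
    if (value === undefined || value === null) {
      if (!spec.optional) errors.push(`missing required field: ${name}`);
      continue;
    }
    if (typeof value !== spec.type) {
      errors.push(`field ${name}: expected ${spec.type}, got ${typeof value}`);
      continue;
    }
    if (spec.enum && !spec.enum.includes(value as string)) {
      errors.push(`field ${name}: ${String(value)} not in [${spec.enum.join(', ')}]`);
    }
  }
  return errors;
}
```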

Model Compatibility

Works with any service exposing an OpenAI-compatible API:

  • Ollama
  • LM Studio
  • OpenRouter
  • Mistral
  • Groq
  • OpenAI
  • vLLM servers
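For reference, these are the typical default base URLs for the providers above. They come from each provider's own documentation, not from Ordis, so verify them against your provider before use:

```shell
# Local servers (default ports)
--base http://localhost:11434/v1        # Ollama
--base http://localhost:1234/v1         # LM Studio
--base http://localhost:8000/v1         # vLLM (default port)

# Hosted providers
--base https://api.openai.com/v1        # OpenAI
--base https://api.groq.com/openai/v1   # Groq
--base https://openrouter.ai/api/v1     # OpenRouter
--base https://api.mistral.ai/v1        # Mistral
```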

Installation

From npm (recommended)

Install globally to use the CLI anywhere:

npm install -g @ordis-dev/ordis
ordis --help

Or install locally in your project:

npm install @ordis-dev/ordis

From Source

git clone https://github.com/ordis-dev/ordis
cd ordis
npm install
npm run build
node dist/cli.js --help

Usage

CLI Usage

Extract data from text using a schema:

ordis extract \
  --schema examples/invoice.schema.json \
  --input examples/invoice.txt \
  --base http://localhost:11434/v1 \
  --model llama3.1:8b \
  --debug

With an API key (for hosted providers like OpenAI, DeepSeek, etc.):

ordis extract \
  --schema examples/invoice.schema.json \
  --input examples/invoice.txt \
  --base https://api.deepseek.com/v1 \
  --model deepseek-chat \
  --api-key your-api-key-here

Programmatic Usage

Use ordis as a library in your Node.js application:

import { extract, loadSchema, loadSchemaFromObject, LLMClient } from '@ordis-dev/ordis';

// Load a schema from a file
const schema = await loadSchema('./invoice.schema.json');

// Or build an equivalent schema from a plain object
const inlineSchema = loadSchemaFromObject({
  fields: {
    invoice_id: { type: 'string' },
    amount: { type: 'number' },
    currency: { type: 'string', enum: ['USD', 'EUR', 'SGD'] }
  }
});

// Configure LLM
const llmConfig = {
  baseURL: 'http://localhost:11434/v1',
  model: 'llama3.2:3b'
};

// Extract data
const result = await extract({
  input: 'Invoice #INV-2024-0042 for $1,250.00 USD',
  schema,
  llmConfig
});

if (result.success) {
  console.log(result.data);
  // { invoice_id: 'INV-2024-0042', amount: 1250, currency: 'USD' }
  console.log('Confidence:', result.confidence);
} else {
  console.error('Extraction failed:', result.errors);
}

Using LLM Presets:

import { extract, loadSchema, LLMPresets } from '@ordis-dev/ordis';

const schema = await loadSchema('./schema.json');

// Use preset configurations
const result = await extract({
  input: text,
  schema,
  llmConfig: LLMPresets.ollama('llama3.2:3b')
  // Or: LLMPresets.openai(apiKey, 'gpt-4o-mini')
  // Or: LLMPresets.lmStudio('local-model')
});

What Works

  • ✅ Schema loader and validator
  • ✅ Prompt builder with confidence scoring
  • ✅ Universal LLM client (OpenAI-compatible APIs)
  • ✅ Token budget awareness with warnings and errors
  • ✅ Structured error system
  • ✅ CLI extraction command
  • ✅ Programmatic API for library usage
  • ✅ Field-level confidence tracking
  • ✅ TypeScript type definitions
  • ✅ Performance benchmarks

Performance

Pipeline overhead is negligible (~1-2ms). LLM calls dominate execution time (1-10s depending on model). See benchmarks/README.md for detailed metrics.

Run benchmarks:

npm run benchmark

Roadmap

Completed in v0.1.0:

  • ✅ Core extraction pipeline with schema validation
  • ✅ Token budget awareness and management
  • ✅ Confidence scoring for extracted data
  • ✅ Programmatic API for library usage
  • ✅ CLI tool with debug mode
  • ✅ Comprehensive test suite and benchmarks
  • ✅ Support for any OpenAI-compatible API

Upcoming:

  • [ ] Smart input truncation (#40)
  • [ ] Multi-pass extraction for large inputs (#41)
  • [ ] Config file support (#16)
  • [ ] Output formatting options (#14)
  • [ ] Batch extraction (#19)
  • [ ] More example schemas (#13)

Contributing

Contributions are welcome!