
expressible

v0.1.3 · 409 downloads

Train small, local text classifiers from labeled examples. No API keys, no cloud, no GPU.

Expressible CLI

Train small, local text classifiers from labeled examples. No API keys, no cloud, no Python, no GPU. Your data never leaves your machine.

Open source (Apache 2.0) · Built by Expressible AI


Distill

Expressible Distill demo

$ expressible distill run "The Vendor shall indemnify and hold harmless the Client
  from all claims, damages, and expenses arising from breach of this Agreement"

{
  "output": "indemnification",
  "confidence": 0.96
}

Classified locally. No API call. Model is ~230KB on disk.

No cloud. No API keys. No external calls. No ML expertise required.


Get to this in 5 minutes

npm install -g expressible

1. Create a project

expressible distill init clause-detector
cd clause-detector

2. Add labeled examples

Provide contract clauses labeled by type — from your own review history, from past LLM outputs, or hand-labeled by your legal team:

# Import from a JSON file
expressible distill add --file ./labeled-clauses.json

# Bulk import from a directory of labeled pairs
expressible distill add --dir ./labeled-clauses/

# Or add one at a time
expressible distill add
# Paste: "For 24 months after closing, Seller shall not own or operate
#          any business competitive with the Business in the Territory."
# Label: non-compete

The JSON file should contain an array of { "input": "...", "output": "..." } objects.

You need at least 10 labeled examples. 50+ gives strong results.
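For reference, a minimal labeled-clauses.json in that shape might look like the following (both clauses are adapted from examples elsewhere in this README):

```json
[
  {
    "input": "The Vendor shall indemnify and hold harmless the Client from all claims, damages, and expenses arising from breach of this Agreement.",
    "output": "indemnification"
  },
  {
    "input": "Either party may terminate this Agreement at any time for any reason by providing 90 days written notice.",
    "output": "termination-for-convenience"
  }
]
```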

3. Train

expressible distill train
Training classify model "clause-detector"
ℹ 87 training samples loaded
ℹ Categories: non-compete, indemnification, limitation-of-liability,
              change-of-control, termination-for-convenience
ℹ Training classifier with 78 samples, validating on 9...
ℹ Early stopping at epoch 34

Training Complete
  Samples              87
  Validation accuracy  93.2%
  Time elapsed         2.8s

✓ Model saved to model/

4. Run

expressible distill run "Either party may terminate this Agreement at any time
  for any reason by providing 90 days written notice"
{
  "output": "termination-for-convenience",
  "confidence": 0.94
}

That ran locally. No network call. The contract text never left your machine.
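Because `run` prints a JSON object with `output` and `confidence` fields, a caller can gate on confidence before trusting a prediction. A minimal Node sketch (the 0.8 threshold is an arbitrary choice for illustration, not a tool default):

```javascript
// Decide whether to accept a Distill prediction or route it for human review.
// `result` has the shape printed by `expressible distill run`.
function triage(result, threshold = 0.8) {
  if (result.confidence >= threshold) {
    return { action: "accept", label: result.output };
  }
  return { action: "review", label: result.output };
}

const result = { output: "termination-for-convenience", confidence: 0.94 };
console.log(triage(result)); // 0.94 clears the 0.8 threshold, so it is accepted
```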

5. Review and improve

expressible distill review

Opens a local web UI where you review predictions, correct mistakes, and approve results. Then:

expressible distill retrain
# → Previous accuracy: 89% → New accuracy: 94% (improved by 5%)

6. Export for production

expressible distill export ./deploy/
# Generates standalone inference.js + model files
# No expressible CLI needed — just Node.js and the model

What Stays on Your Machine

Everything.

  • Training data never leaves your filesystem
  • The embedding model runs locally — no API calls, ever
  • Trained models are files you own and control
  • The review UI runs on localhost
  • Zero telemetry. Zero analytics. Zero phone-home.

Works in environments where data can't leave the network — healthcare, financial services, legal, government, or any organization with data residency requirements.


Use Cases

Legal document review — Classify contract clauses by type across thousands of agreements. Privileged documents stay within your perimeter.

Log analysis and alerting — Classify application logs as normal, warning, error, security event, or performance degradation. Thousands per hour, entirely local.

Content moderation — Classify user-generated content against your community guidelines. Consistent categories, high volume.

Support ticket routing — Route incoming tickets to the right team by category. Same classification task, repeated thousands of times.


Benchmarks

With 50 labeled examples (~30 minutes of work), no API keys, and no ML expertise:

| Scenario | Accuracy | Data Source |
|---|---|---|
| Support ticket routing (4 categories) | 95.0% | Synthetic |
| Content moderation (3 categories) | 90.0% | Synthetic |
| News categorization (5 categories) | 88.0% | Synthetic |
| 20 Newsgroups (5 categories) | 80.0% | Public dataset |
| AG News (4 categories) | 64.0% | Public dataset |

Reproduce these results:

npx tsx tests/harness/run.ts

Known limitation: Distill struggles with sentiment and tone classification (44–50% accuracy). The embedding model captures what text is about, not how it evaluates. "Amazing camera" and "terrible camera" produce nearly identical vectors. Details in benchmarks.
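This failure mode is easy to see with a toy cosine-similarity check: if two reviews embed to nearly the same vector, no classifier head can tell them apart. The vectors below are invented for illustration (real MiniLM embeddings are 384-dimensional):

```javascript
// Cosine similarity between two vectors: dot(a, b) / (|a| * |b|).
function cosine(a, b) {
  const dot = a.reduce((sum, v, i) => sum + v * b[i], 0);
  const norm = v => Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  return dot / (norm(a) * norm(b));
}

// Pretend embeddings: both sentences are "about" cameras, so the vectors
// land close together even though the sentiment is opposite.
const amazing  = [0.80, 0.55, 0.21];
const terrible = [0.78, 0.57, 0.20];
console.log(cosine(amazing, terrible).toFixed(3)); // very close to 1.0
```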

AG News improves to 80% with 100 training samples. More data helps — see benchmarks for scaling details. Public dataset results use real-world text from established ML benchmarks — 50 samples drawn from datasets containing 120,000+ entries.

Accuracy improves as you add more examples through the review-retrain loop. Full results, methodology, and known limitations: docs/benchmarks.md


Task Type

Distill trains classification models: text in, one of N categories out.

| Type | Input → Output | Example |
|------|---------------|---------|
| classify | Text → one of N categories | Contract clause → indemnification |

Commands

expressible distill init <name>     Create a new project
expressible distill add             Add training examples (interactive, file, or bulk)
expressible distill train           Train a model from your samples
expressible distill run <input>     Run inference on text or files
expressible distill review          Open web UI to review and correct predictions
expressible distill retrain         Retrain using review feedback
expressible distill stats           Show project statistics
expressible distill export <dir>    Export model for standalone use
expressible distill doctor          Check system requirements and project health
expressible distill setup           Pre-download embedding model for offline use

What You Need

  • Node.js 18+
  • ~200MB disk space (embedding model + trained model)
  • No Python, no GPU, no Docker

FAQ

Is this a replacement for LLMs? No. This replaces the repetitive, pattern-based subset of LLM calls — the ones where the same prompt structure processes different data every time. For tasks that require reasoning, creativity, or open-ended generation, you still want an LLM.

How many examples do I need? 10 minimum. In practice, 50–100 examples with good coverage of your categories will give you strong results.

How accurate is it? For well-defined classification tasks with clear categories and 50+ examples, 85–95% accuracy is typical. The review → retrain loop lets you improve iteratively.

Does it work with non-English text? The underlying embedding model supports 100+ languages but is strongest in English. Performance varies by language. Test with your data.

Can I use this in CI/CD? Yes. expressible distill run accepts file paths, globs, and piped stdin. Outputs JSON to stdout.
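In a pipeline, that JSON output composes with ordinary shell tools. A sketch with the classifier's output stubbed in so the plumbing stays runnable; in real CI you would pipe the output of `expressible distill run` instead of a fixed string:

```shell
# Stand-in for a line produced by `expressible distill run` in CI.
result='{"output":"security-event","confidence":0.91}'

# Pull out the label with Node (avoids a jq dependency) and react to it.
label=$(printf '%s' "$result" | node -e 'const r = JSON.parse(require("fs").readFileSync(0, "utf8")); console.log(r.output)')
if [ "$label" = "security-event" ]; then
  echo "security event detected"
fi
```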

Where does the language understanding come from? Distill uses a pre-trained sentence embedding model (all-MiniLM-L6-v2) that runs locally. It converts text into numerical vectors that capture meaning. A small neural network trained on your examples learns to map those vectors to your labels. The embedding model downloads once (~80MB) and is cached. Everything after that is offline.
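The last step of that pipeline, mapping an embedding vector to label probabilities, is conceptually a small linear layer followed by a softmax. A toy illustration with invented 3-dimensional "embeddings" and weights (real MiniLM vectors are 384-dimensional, and Distill's actual head may differ):

```javascript
// Toy classifier head: scores = W·x + b, probabilities = softmax(scores).
// The weights here are invented; in Distill they are learned from your examples.
function softmax(scores) {
  const max = Math.max(...scores);
  const exps = scores.map(s => Math.exp(s - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map(e => e / sum);
}

function classify(x, W, b, labels) {
  const scores = W.map((row, i) => row.reduce((acc, w, j) => acc + w * x[j], b[i]));
  const probs = softmax(scores);
  const best = probs.indexOf(Math.max(...probs));
  return { output: labels[best], confidence: probs[best] };
}

const labels = ["indemnification", "non-compete"];
const W = [[2.0, -1.0, 0.5], [-1.5, 1.8, 0.2]]; // 2 labels x 3 dims
const b = [0.1, -0.1];
const x = [0.9, -0.3, 0.4]; // pretend embedding of a clause
console.log(classify(x, W, b, labels)); // "indemnification" wins here
```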


Project Structure

my-task/
  distill.config.json        # Task configuration
  samples/                   # Training data (input/output pairs)
  model/                     # Trained model files
  validation/                # Review results
  .distill/                  # Embedding cache

The CLI is fully standalone and open source. Expressible AI offers additional tooling for teams that need governance and managed deployment.

Contributing

See CONTRIBUTING.md for guidelines.

License

Apache 2.0 — See LICENSE

Copyright 2026 Expressible AI, Inc.