transformers-llguidance v0.2.1
transformers-llguidance

Structured output generation for transformers.js using llguidance.

This library enables constrained text generation in the browser and Node.js by integrating the high-performance llguidance Rust library with transformers.js via WebAssembly.

Features

  • JSON Schema constraints - Generate valid JSON matching any JSON Schema
  • Regex patterns - Constrain output to match regular expressions
  • Lark grammars - Full CFG support for complex structured output
  • Speculative decoding - Optimized performance with fast-path token validation
  • Zero server dependencies - Runs entirely in browser/Node.js

Installation

npm install transformers-llguidance

Quick Start

import { pipeline } from '@huggingface/transformers';
import {
  GuidanceParser,
  GuidanceLogitsProcessor,
  extractTokenizerData,
} from 'transformers-llguidance';

// Load a model
const generator = await pipeline('text-generation', 'Xenova/gpt2');

// Extract tokenizer data
const tokenizerData = extractTokenizerData(generator.tokenizer);

// Create a parser with JSON schema constraint
const parser = await GuidanceParser.create({
  type: 'json_schema',
  schema: {
    type: 'object',
    properties: {
      name: { type: 'string' },
      age: { type: 'number' }
    },
    required: ['name', 'age']
  }
}, tokenizerData);

// Create logits processor
const processor = new GuidanceLogitsProcessor(parser);

// Generate constrained output
const output = await generator('Generate a person:', {
  max_new_tokens: 50,
  logits_processor: [processor],
});

console.log(output[0].generated_text);
// Output will always be valid JSON matching the schema

Grammar Types

JSON Schema

const grammar = {
  type: 'json_schema',
  schema: {
    type: 'object',
    properties: {
      name: { type: 'string' },
      age: { type: 'integer', minimum: 0 }
    },
    required: ['name', 'age']
  }
};

Regex Pattern

const grammar = {
  type: 'regex',
  pattern: '[a-zA-Z]+@[a-zA-Z]+\\.[a-zA-Z]{2,}'
};

Lark Grammar (CFG)

const grammar = {
  type: 'lark',
  grammar: `
    start: expr
    expr: term (("+"|"-") term)*
    term: NUMBER
    NUMBER: /[0-9]+/
  `,
  startSymbol: 'start'
};

API Reference

GuidanceParser

The core parser that wraps the llguidance WASM module.

class GuidanceParser {
  // Create a new parser instance
  static async create(grammar: Grammar, tokenizer: TokenizerData): Promise<GuidanceParser>;

  // Fast O(1) check if a token is allowed
  isTokenAllowed(tokenId: number): boolean;

  // Get full token mask (slower, use for fallback)
  getTokenMask(): Uint8Array;

  // Advance parser state after token selection
  advance(tokenId: number): void;

  // Check if generation can terminate
  isComplete(): boolean;

  // Reset parser for reuse
  reset(): void;

  // Get vocabulary size
  get vocabSize(): number;
}

GuidanceLogitsProcessor

Logits processor compatible with transformers.js.

class GuidanceLogitsProcessor {
  constructor(parser: GuidanceParser, options?: ProcessorOptions);

  // Process logits (called by transformers.js)
  process(inputIds: number[], logits: Float32Array): Float32Array;

  // Advance state after sampling (call after each token)
  onToken(tokenId: number): void;

  // Check if generation can stop
  canStop(): boolean;

  // Reset for new generation
  reset(): void;
}

interface ProcessorOptions {
  // Number of top tokens to try before full mask (default: 5)
  speculationDepth?: number;

  // Enable debug logging (default: false)
  debug?: boolean;
}

Tokenizer Utilities

// Extract tokenizer data from transformer.js tokenizer
function extractTokenizerData(tokenizer: TransformersTokenizer): TokenizerData;

// Load tokenizer data directly from HuggingFace Hub
async function loadTokenizerData(modelId: string, options?: {
  token?: string;
  baseUrl?: string;
}): Promise<TokenizerData>;

How It Works

  1. Grammar compilation: llguidance compiles your grammar (JSON schema, regex, or Lark) into an efficient state machine
  2. Speculative checking: During generation, we first check if the model's top-k predicted tokens are valid (fast path)
  3. Fallback masking: If no top-k tokens are valid, we compute the full token mask (slower path)
  4. Logit modification: Invalid tokens have their logits set to -∞, ensuring they're never sampled
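The fast-path/fallback interplay in steps 2–4 can be sketched in plain JavaScript. This is an illustrative reimplementation, not the library's actual code; `parser` stands in for any object exposing the documented isTokenAllowed() and getTokenMask() methods:

```javascript
// Sketch of speculative logit masking (illustrative stand-in, not the
// library's internals). `parser` must provide isTokenAllowed(id) and
// getTokenMask() as described in the API reference above.
function maskLogits(logits, parser, speculationDepth = 5) {
  // Indices of the top-k logits: the fast-path candidates.
  const topK = [...logits.keys()]
    .sort((a, b) => logits[b] - logits[a])
    .slice(0, speculationDepth);

  const masked = Float32Array.from(logits);

  // Fast path: if a top-k token is valid, keep only that token.
  const winner = topK.find((id) => parser.isTokenAllowed(id));
  if (winner !== undefined) {
    for (let id = 0; id < masked.length; id++) {
      if (id !== winner) masked[id] = -Infinity;
    }
    return masked;
  }

  // Slow path: compute the full token mask and zero out invalid tokens.
  const mask = parser.getTokenMask(); // Uint8Array, 1 = allowed
  for (let id = 0; id < masked.length; id++) {
    if (!mask[id]) masked[id] = -Infinity;
  }
  return masked;
}
```

Setting disallowed logits to -Infinity guarantees they receive zero probability after softmax, so sampling can never pick them.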

Generation Loop

  1. Model produces logits
  2. GuidanceLogitsProcessor.process() called
    1. Try top-5 tokens with isTokenAllowed()
    2. If hit: mask all except winner
    3. If miss: compute full mask with getTokenMask()
  3. Sample from modified logits
  4. Call processor.onToken() with sampled token
  5. Repeat until processor.canStop() or max tokens
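The loop above can be exercised end to end with stand-in components. Everything here is a hypothetical sketch (greedy argmax instead of sampling; `model` and `processor` are mocks shaped like the documented interface, not the real library):

```javascript
// Minimal greedy generation loop (illustrative; `model` and `processor`
// are hypothetical stand-ins matching the documented interface).
function generate(model, processor, maxTokens) {
  const tokens = [];
  for (let step = 0; step < maxTokens; step++) {
    const logits = model(tokens);                      // 1. model produces logits
    const masked = processor.process(tokens, logits);  // 2. constrain logits
    let best = 0;                                      // 3. greedy "sample"
    for (let id = 1; id < masked.length; id++) {
      if (masked[id] > masked[best]) best = id;
    }
    tokens.push(best);
    processor.onToken(best);                           // 4. advance parser state
    if (processor.canStop()) break;                    // 5. stop when grammar completes
  }
  return tokens;
}
```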

Building from Source

Prerequisites

  • Node.js 18+
  • Rust toolchain with wasm32-unknown-unknown target
  • wasm-pack

Build

# Install dependencies
npm install

# Build WASM module
npm run build:wasm

# Build TypeScript
npm run build

# Run tests
npm test

Performance Tips

  1. Use speculative decoding: The default speculationDepth: 5 works well for most cases. Increase for models with more uncertain predictions.

  2. Reuse parsers: Create the parser once and call reset() between generations instead of creating new instances.

  3. Batch processing: When generating multiple outputs with the same grammar, reuse the same parser instance.
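A minimal sketch of the reuse pattern from tips 2 and 3, assuming a GuidanceParser-like object with reset(); `createParser` and `runGeneration` here are hypothetical placeholders, not library functions:

```javascript
// Reuse one parser across generations instead of recompiling the grammar
// each time (illustrative; createParser/runGeneration are hypothetical).
async function generateMany(prompts, createParser, runGeneration) {
  const parser = await createParser(); // compile the grammar once
  const results = [];
  for (const prompt of prompts) {
    parser.reset();                    // clear parser state between runs
    results.push(await runGeneration(prompt, parser));
  }
  return results;
}
```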

Limitations

  • Currently requires the WASM module to be built from source
  • Some llguidance features may require adjustment for WASM compatibility
  • Large grammars may increase WASM binary size

License

MIT

Acknowledgments