@llmdata/rubric

v0.1.0

TypeScript/Node.js bindings for Rubric - LLM-based evaluation using weighted rubrics. High-performance Rust core with idiomatic TypeScript API.

About

This package provides TypeScript/Node.js bindings to the Rubric Rust core library via napi-rs. The evaluation logic runs in Rust for performance, and the bindings expose it through an idiomatic TypeScript API for use in JavaScript environments.

Installation

npm install @llmdata/rubric
yarn add @llmdata/rubric
pnpm add @llmdata/rubric
bun add @llmdata/rubric

Quick Start

  1. Set up environment variables:
export OPENAI_API_KEY=your_api_key_here
# Or any other model API key used in your generate function
  2. Run the example below:
import { Rubric, PerCriterionGrader } from '@llmdata/rubric';
import OpenAI from 'openai';

// Declare custom generate function with any model and inference provider
async function generateWithOpenAI(systemPrompt: string, userPrompt: string): Promise<string> {
  const client = new OpenAI({
    apiKey: process.env.OPENAI_API_KEY,
  });

  const response = await client.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: [
      { role: 'system', content: systemPrompt },
      { role: 'user', content: userPrompt },
    ],
    max_tokens: 400,
    temperature: 0.0,
  });

  return response.choices[0]?.message?.content || '';
}

async function main() {
  // Build rubric
  const rubric = Rubric.fromDict([
    { weight: 10.0, requirement: "States Q4 2023 base margin as 17.2%" },
    { weight: 8.0, requirement: "Explicitly uses Shapley attribution for decomposition" },
    { weight: -15.0, requirement: "Uses total deliveries instead of cash-only deliveries" }
  ]);

  // Select autograder strategy
  const grader = new PerCriterionGrader(
    generateWithOpenAI,
    "This overrides the default grader system prompt"
  );

  // Grade output
  const result = await rubric.grade(
    "Output to evaluate...",
    grader,
    "Input query..."
  );

  console.log(`Score: ${result.score.toFixed(2)}`);  // Score is 0.0-1.0
  
  if (result.report) {
    for (const criterion of result.report) {
      console.log(`  [${criterion.verdict}] ${criterion.requirement}`);
      console.log(`    → ${criterion.reason}`);
    }
  }
}

main().catch(console.error);

Autograder Strategies

PerCriterionGrader

Evaluates each criterion with its own inference call; the calls run in parallel.

Scoring Formula:

For each criterion i:

  • If verdict = MET, contribution = w_i
  • If verdict = UNMET, contribution = 0

Final score:

score = max(0, min(1, Σ(verdict_i = MET ? w_i : 0) / Σ(max(0, w_i))))

Where:

  • w_i = weight of criterion i
  • Denominator = sum of positive weights only
  • Numerator = sum of weights for MET criteria (a MET criterion with a negative weight subtracts from the score)
  • Result clamped to [0, 1]
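
To make the formula concrete, here is a minimal TypeScript sketch of the same arithmetic. It is an illustration only: the real scoring runs in the Rust core, and `perCriterionScore` and `GradedCriterion` are hypothetical names that are not part of the package API. Returning 0 when there are no positive weights is an assumption; the formula above does not define that case.

```typescript
type Verdict = "MET" | "UNMET";

// Hypothetical shape for this sketch: only the fields the formula needs.
interface GradedCriterion {
  weight: number;
  verdict: Verdict;
}

// Hypothetical helper mirroring the formula above; not part of @llmdata/rubric.
function perCriterionScore(criteria: GradedCriterion[]): number {
  let numerator = 0; // sum of weights for MET criteria (may include negatives)
  let denominator = 0; // sum of positive weights only

  for (const { weight, verdict } of criteria) {
    if (verdict === "MET") numerator += weight;
    if (weight > 0) denominator += weight;
  }

  // Assumption: with no positive weights the score is reported as 0.
  if (denominator === 0) return 0;

  // Clamp to [0, 1]
  return Math.max(0, Math.min(1, numerator / denominator));
}

// The three Quick Start criteria, with only the first one MET: 10 / (10 + 8)
const example: GradedCriterion[] = [
  { weight: 10, verdict: "MET" },
  { weight: 8, verdict: "UNMET" },
  { weight: -15, verdict: "UNMET" },
];
console.log(perCriterionScore(example).toFixed(3)); // prints 0.556
```

If the penalty criterion were also MET, the numerator would be 10 - 15 = -5 and the clamp would bring the score up to 0.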

PerCriterionOneShotGrader (Coming Soon)

Makes 1 inference call that evaluates all criteria together and returns a structured output, unlike PerCriterionGrader which makes n inference calls.

RubricAsJudgeGrader (Coming Soon)

Holistic evaluation where the model returns a final score directly.

API Reference

Rubric

class Rubric {
  constructor(criteria: CriterionInput[]);
  static fromDict(criteria: CriterionInput[]): Rubric;
  static fromJson(json: string): Rubric;
  static fromYaml(yaml: string): Rubric;
  static fromFile(path: string): Rubric;
  
  len(): number;
  isEmpty(): boolean;
  
  grade(
    toGrade: string,
    grader?: PerCriterionGrader,
    query?: string
  ): Promise<EvaluationReport>;
}

PerCriterionGrader

class PerCriterionGrader {
  constructor(
    generateFn?: GenerateFunction,
    systemPrompt?: string
  );
}

Types

type CriterionInput = {
  weight: number;
  requirement: string;
};

type CriterionReport = {
  weight: number;
  requirement: string;
  verdict: "MET" | "UNMET";
  reason: string;
};

type EvaluationReport = {
  score: number;
  report?: CriterionReport[];
};

type GenerateFunction = (
  systemPrompt: string,
  userPrompt: string
) => Promise<string> | string;
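
As an illustration of how these types can be consumed, the sketch below pulls out the criteria that cost points in a graded result: positive-weight criteria that were UNMET and negative-weight criteria that were MET. `findProblems` is a hypothetical helper, not something the package exports; the two types are copied from the definitions above so the snippet stands alone.

```typescript
type CriterionReport = {
  weight: number;
  requirement: string;
  verdict: "MET" | "UNMET";
  reason: string;
};

type EvaluationReport = {
  score: number;
  report?: CriterionReport[];
};

// Hypothetical helper; not part of @llmdata/rubric.
function findProblems(result: EvaluationReport): {
  missed: CriterionReport[];
  penalties: CriterionReport[];
} {
  // `report` is optional, so fall back to an empty list
  const report = result.report ?? [];
  return {
    // Desired behaviors the output failed to show
    missed: report.filter((c) => c.weight > 0 && c.verdict === "UNMET"),
    // Undesired behaviors the output did show
    penalties: report.filter((c) => c.weight < 0 && c.verdict === "MET"),
  };
}
```

The object returned by `rubric.grade(...)` in the Quick Start could be passed to a helper like this to decide what to surface to a reviewer.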

Loading Rubrics

import { Rubric } from '@llmdata/rubric';

// Direct construction
const direct = new Rubric([
  { weight: 10.0, requirement: "States Q4 2023 base margin as 17.2%" },
  { weight: 8.0, requirement: "Explicitly uses Shapley attribution for decomposition" },
  { weight: -15.0, requirement: "Uses total deliveries instead of cash-only deliveries" }
]);

// From array of objects
const fromObjects = Rubric.fromDict([
  { weight: 10.0, requirement: "States Q4 2023 base margin as 17.2%" },
  { weight: 8.0, requirement: "Explicitly uses Shapley attribution for decomposition" }
]);

// From JSON string (static loaders as declared in the API Reference)
const fromJson = Rubric.fromJson('[{"weight": 10.0, "requirement": "Example requirement"}]');

// From YAML string
const yamlData = `
- weight: 10.0
  requirement: "Example requirement"
`;
const fromYaml = Rubric.fromYaml(yamlData);

// From files
const fromJsonFile = Rubric.fromFile('rubric.json');
const fromYamlFile = Rubric.fromFile('rubric.yaml');

JSON Format

[
  {
    "weight": 10.0,
    "requirement": "States Q4 2023 base margin as 17.2%"
  },
  {
    "weight": 8.0,
    "requirement": "Explicitly uses Shapley attribution for decomposition"
  },
  {
    "weight": -15.0,
    "requirement": "Uses total deliveries instead of cash-only deliveries"
  }
]
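
This README does not say how the loaders report malformed input, so it can be useful to check a JSON rubric in plain TypeScript before handing it to the library. The sketch below is one way to do that under the format shown above. `parseCriteria` is a hypothetical helper, not part of the package, and its rules (finite numeric weight, non-empty requirement) are assumptions inferred from the `CriterionInput` type.

```typescript
type CriterionInput = {
  weight: number;
  requirement: string;
};

// Hypothetical pre-check; not part of @llmdata/rubric.
function parseCriteria(json: string): CriterionInput[] {
  const data: unknown = JSON.parse(json); // throws SyntaxError on invalid JSON

  if (!Array.isArray(data)) {
    throw new TypeError("Rubric JSON must be an array of criteria");
  }

  return data.map((item: unknown, index: number) => {
    if (typeof item !== "object" || item === null) {
      throw new TypeError(`criteria[${index}] must be an object`);
    }
    const { weight, requirement } = item as Record<string, unknown>;

    // Assumed rule: weights are finite numbers (negative values are penalties)
    if (typeof weight !== "number" || !Number.isFinite(weight)) {
      throw new TypeError(`criteria[${index}].weight must be a finite number`);
    }
    // Assumed rule: requirements are non-empty strings
    if (typeof requirement !== "string" || requirement.trim() === "") {
      throw new TypeError(`criteria[${index}].requirement must be a non-empty string`);
    }
    return { weight, requirement };
  });
}
```

The validated array has the `CriterionInput[]` shape that `Rubric.fromDict` and the `Rubric` constructor accept.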

YAML Format

- weight: 10.0
  requirement: "States Q4 2023 base margin as 17.2%"
- weight: 8.0
  requirement: "Explicitly uses Shapley attribution for decomposition"
- weight: -15.0
  requirement: "Uses total deliveries instead of cash-only deliveries"

Examples with Different Providers

OpenAI

import OpenAI from 'openai';

async function generateWithOpenAI(systemPrompt: string, userPrompt: string): Promise<string> {
  const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
  const response = await client.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: [
      { role: 'system', content: systemPrompt },
      { role: 'user', content: userPrompt },
    ],
    max_tokens: 400,
    temperature: 0.0,
  });
  return response.choices[0]?.message?.content || '';
}

Anthropic

import Anthropic from '@anthropic-ai/sdk';

async function generateWithAnthropic(systemPrompt: string, userPrompt: string): Promise<string> {
  const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
  const response = await client.messages.create({
    model: 'claude-3-5-sonnet-20241022',
    max_tokens: 400,
    system: systemPrompt,
    messages: [{ role: 'user', content: userPrompt }],
  });
  // Guard against an empty content array before reading the first block
  const block = response.content[0];
  return block?.type === 'text' ? block.text : '';
}

OpenRouter

import OpenAI from 'openai';

async function generateWithOpenRouter(systemPrompt: string, userPrompt: string): Promise<string> {
  const client = new OpenAI({
    baseURL: 'https://openrouter.ai/api/v1',
    apiKey: process.env.OPENROUTER_API_KEY,
  });
  const response = await client.chat.completions.create({
    model: 'anthropic/claude-3.5-sonnet',
    messages: [
      { role: 'system', content: systemPrompt },
      { role: 'user', content: userPrompt },
    ],
  });
  return response.choices[0]?.message?.content || '';
}

Local Models (Ollama)

import { Ollama } from 'ollama';

async function generateWithOllama(systemPrompt: string, userPrompt: string): Promise<string> {
  const ollama = new Ollama();
  const response = await ollama.chat({
    model: 'llama3.1',
    messages: [
      { role: 'system', content: systemPrompt },
      { role: 'user', content: userPrompt },
    ],
  });
  return response.message.content;
}
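
Because every provider above is an ordinary function with the `GenerateFunction` shape, cross-cutting behavior can be added by wrapping one. The sketch below adds retries with exponential backoff around any generate function. `withRetry`, `maxAttempts`, and `baseDelayMs` are hypothetical names for illustration and are not part of the package; whether the library retries failed calls on its own is not stated in this README.

```typescript
type GenerateFunction = (
  systemPrompt: string,
  userPrompt: string
) => Promise<string> | string;

// Hypothetical wrapper; not part of @llmdata/rubric.
function withRetry(
  generate: GenerateFunction,
  maxAttempts = 3,
  baseDelayMs = 250
): GenerateFunction {
  if (maxAttempts < 1) {
    throw new RangeError("maxAttempts must be at least 1");
  }
  return async (systemPrompt, userPrompt) => {
    let lastError: unknown;
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        // `await` handles both sync and async generate functions
        return await generate(systemPrompt, userPrompt);
      } catch (error) {
        lastError = error;
        if (attempt < maxAttempts) {
          // Exponential backoff: baseDelayMs, then 2x, 4x, ...
          const delay = baseDelayMs * 2 ** (attempt - 1);
          await new Promise((resolve) => setTimeout(resolve, delay));
        }
      }
    }
    // All attempts failed: surface the last error to the caller
    throw lastError;
  };
}
```

A wrapped function can be passed wherever a generate function is expected, for example `new PerCriterionGrader(withRetry(generateWithOpenAI))`.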

Requirements

  • Node.js 16+
  • TypeScript 5.0+ (optional, for TypeScript users)
  • An LLM API (e.g., OpenAI, Anthropic, OpenRouter, local models)

Platform Support

Pre-built binaries are available for:

  • macOS: x64, ARM64 (Apple Silicon)
  • Linux: x64, ARM64 (glibc and musl)
  • Windows: x64, ARM64

If a pre-built binary is not available for your platform, the package compiles from source during installation, which requires the Rust toolchain.

Building from Source

If you need to build from source:

# Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Clone the repository
git clone https://github.com/The-LLM-Data-Company/rubric.git
cd rubric/bindings/node

# Install dependencies and build
npm install
npm run build

# Run tests
npm test

For detailed publishing instructions, see NPM_PUBLISHING.md.

Performance

The Rust core and the TypeScript bindings are designed to provide:

  • Fast evaluation: rubric scoring runs as native Rust code
  • Memory efficiency: low memory overhead compared with a pure JavaScript implementation
  • Concurrent grading: multiple criteria are processed in parallel
  • Type safety: TypeScript definitions cover the public API

Contributing

Contributions are welcome! Please see the main repository for contribution guidelines.

License

MIT License - see LICENSE file for details.
