@cyclecore/slmbench

v1.0.1

CLI and SDK for accessing SLMBench benchmarks (EdgeJSON, EdgeIntent, EdgeFuncCall). View leaderboards, run evaluations, and compare Small Language Models.

Downloads

43

View leaderboards, run evaluations, and compare Small Language Models on production-grade benchmarks.

Quick Start

# View EdgeJSON leaderboard
npx @cyclecore/slmbench leaderboard

# See top 5 models
npx @cyclecore/slmbench top 5

# Check specific model performance
npx @cyclecore/slmbench model maaza-360m

# Compare two models
npx @cyclecore/slmbench compare maaza deepseek

Installation

npm install @cyclecore/slmbench

CLI Commands

View Leaderboard

npx @cyclecore/slmbench leaderboard [benchmark]
# or
npx @cyclecore/slmbench edgejson

Shows full leaderboard with rankings, accuracy, latency, and key insights.

Top Models

npx @cyclecore/slmbench top [n]

Shows the top N models (default: 5) with detailed performance metrics.

Model Performance

npx @cyclecore/slmbench model <name>

Shows a detailed performance breakdown for a specific model, including complexity tier analysis.

Compare Models

npx @cyclecore/slmbench compare <model1> <model2>

Side-by-side comparison of two models showing accuracy, speed, and efficiency advantages.


SDK Usage (Programmatic Access)

Perfect for AI agents, tool use, and automated analysis:

Get Leaderboard Data

import { getLeaderboard } from '@cyclecore/slmbench';

const leaderboard = await getLeaderboard('edgejson');
console.log(leaderboard.models); // Array of all models with full data
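
Once the leaderboard is loaded, a typical next step is to filter or re-rank it. The sketch below runs on a hard-coded copy of the published EdgeJSON table rather than a live SDK response. The entry fields (`name`, `paramsM`, `jsonExact`, `fieldF1`) and the helper `rankWithinBudget` are assumptions for illustration, since the actual shape of `leaderboard.models` is not documented here.

```javascript
// Sample rows copied from the published EdgeJSON leaderboard (Nov 27, 2025).
// Field names are hypothetical; the real SDK objects may be shaped differently.
const sampleModels = [
  { name: 'Qwen-2.5-3B', paramsM: 3000, jsonExact: 0.060, fieldF1: 0.105 },
  { name: 'Maaza-360M', paramsM: 360, jsonExact: 0.551, fieldF1: 0.729 },
  { name: 'Phi-3.5-Mini', paramsM: 3800, jsonExact: 0.020, fieldF1: 0.031 },
  { name: 'Maaza-135M', paramsM: 135, jsonExact: 0.468, fieldF1: 0.534 },
  { name: 'DeepSeek-R1-1.5B', paramsM: 1500, jsonExact: 0.160, fieldF1: 0.317 },
];

// Return the models at or under a parameter budget (in millions),
// ordered by JSONExact, best first. filter() copies, so the input is not mutated.
function rankWithinBudget(models, maxParamsM) {
  return models
    .filter((m) => m.paramsM <= maxParamsM)
    .sort((a, b) => b.jsonExact - a.jsonExact);
}

const under2B = rankWithinBudget(sampleModels, 2000);
console.log(under2B.map((m) => m.name));
// [ 'Maaza-360M', 'Maaza-135M', 'DeepSeek-R1-1.5B' ]
```

With real data, the same helper could be applied to `leaderboard.models` after adjusting the field names to whatever the SDK returns.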

Get Model Performance

import { getModelPerformance } from '@cyclecore/slmbench';

const model = await getModelPerformance('maaza-360m');
console.log(model.jsonExact); // 0.551
console.log(model.complexity.simple); // 0.789

Compare Models

import { compareModels } from '@cyclecore/slmbench';

const comparison = await compareModels('maaza', 'deepseek');
console.log(comparison.comparison.jsonExact.ratio); // 3.4×

Get Top N Models

import { getTopModels } from '@cyclecore/slmbench';

const top3 = await getTopModels(3);
console.log(top3[0].name); // 'Maaza-SLM-360M-JSON-v1'

Available Benchmarks

📊 EdgeJSON

Structured JSON extraction from real-world documents

  • 158 test cases across 24 schemas
  • Complexity tiers: Simple (2-4 fields), Medium (5-8 fields), Complex (8+ fields)
  • Primary metric: JSONExact (exact match accuracy)
  • Use cases: Invoices, receipts, emails, meeting notes, support tickets
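
The benchmark's scoring code is not shown here, so the following is only a sketch of how an exact-match metric and a field-level F1 could be computed for a single extraction. It assumes flat objects, counts a field as correct only when both key and value match strictly, and uses hypothetical helper names (`jsonExact`, `fieldF1`); the real SLMBench scoring may normalize values or handle nesting differently.

```javascript
// True if obj has key k as its own property with exactly the expected value.
function fieldMatches(obj, k, expected) {
  return Object.prototype.hasOwnProperty.call(obj, k) && obj[k] === expected;
}

// Exact match: 1 only when both objects have the same keys and identical values.
function jsonExact(pred, ref) {
  const refKeys = Object.keys(ref);
  if (Object.keys(pred).length !== refKeys.length) return 0;
  return refKeys.every((k) => fieldMatches(pred, k, ref[k])) ? 1 : 0;
}

// Field-level F1: harmonic mean of precision and recall over correct fields.
function fieldF1(pred, ref) {
  const refKeys = Object.keys(ref);
  const correct = refKeys.filter((k) => fieldMatches(pred, k, ref[k])).length;
  if (correct === 0) return 0;
  const precision = correct / Object.keys(pred).length;
  const recall = correct / refKeys.length;
  return (2 * precision * recall) / (precision + recall);
}

// Made-up invoice example: two of three fields are right.
const reference = { vendor: 'Acme', total: 42.5, currency: 'USD' };
const prediction = { vendor: 'Acme', total: 42.5, currency: 'EUR' };

console.log(jsonExact(prediction, reference)); // 0
console.log(fieldF1(prediction, reference).toFixed(3)); // "0.667"
```

The example shows why the two metrics can diverge: a single wrong field zeroes the exact-match score while the field-level score still credits the two correct fields.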

🎯 EdgeIntent (Coming Soon)

Intent classification for edge deployment

🔧 EdgeFuncCall (Coming Soon)

Function calling and tool use evaluation


Current Leaderboard (Updated Nov 27, 2025)

| Rank | Model | Size | JSONExact | Field F1 |
|------|-------|------|-----------|----------|
| 🥇 1 | Maaza-360M | 360M | 55.1% | 0.729 |
| 🥈 2 | Maaza-135M | 135M | 46.8% | 0.534 |
| 🥉 3 | DeepSeek-R1-1.5B | 1.5B | 16.0% | 0.317 |
| 4 | Qwen-2.5-3B | 3B | 6.0% | 0.105 |
| 5 | Phi-3.5-Mini | 3.8B | 2.0% | 0.031 |

Key Finding: Maaza-360M (fine-tuned, 360M parameters) outperforms DeepSeek-R1-1.5B (JSON-optimized, 4.2× larger) by 3.4× on JSONExact (55.1% vs. 16.0%). DeepSeek-R1-1.5B scored 0.0% on medium-complexity schemas despite explicit JSON-mode training.
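
The two multipliers in the key finding follow directly from the leaderboard figures. A quick check, using only the sizes and JSONExact scores from the table above (the variable names are illustrative):

```javascript
// Sizes and JSONExact scores taken from the leaderboard table.
const maaza = { params: 360e6, jsonExact: 55.1 };
const deepseek = { params: 1.5e9, jsonExact: 16.0 };

// How much larger DeepSeek-R1-1.5B is than Maaza-360M.
const sizeRatio = deepseek.params / maaza.params;

// How much higher Maaza-360M scores on JSONExact.
const accuracyRatio = maaza.jsonExact / deepseek.jsonExact;

console.log(sizeRatio.toFixed(1) + '× larger'); // "4.2× larger"
console.log(accuracyRatio.toFixed(1) + '× higher JSONExact'); // "3.4× higher JSONExact"
```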


For AI Agents (Claude, GPT, Grok, Gemini)

This package is optimized for AI tool use and function calling:

Tool Definition Example

{
  "name": "get_json_extraction_leaderboard",
  "description": "Get the latest EdgeJSON benchmark leaderboard showing JSON extraction performance across models. Useful when users ask about model comparison or JSON extraction capabilities.",
  "parameters": {
    "type": "object",
    "properties": {
      "format": {
        "type": "string",
        "enum": ["full", "top5", "specific_model"],
        "description": "Output format"
      },
      "model_name": {
        "type": "string",
        "description": "Specific model to query (if format is 'specific_model')"
      }
    }
  }
}
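
One way an agent runtime might route calls to this tool onto the SDK functions is sketched below. The mapping itself is an assumption, not something the package ships: `planToolCall` and `runToolCall` are hypothetical helpers, and treating a missing `format` as `"full"` is a choice made for this example. Only the SDK function names (`getLeaderboard`, `getTopModels`, `getModelPerformance`) come from the documentation above.

```javascript
// Translate tool arguments into a description of which SDK function to call.
// Pure and synchronous, so the routing can be tested without network access.
function planToolCall(args = {}) {
  const format = args.format ?? 'full'; // assumed default
  switch (format) {
    case 'full':
      return { fn: 'getLeaderboard', params: ['edgejson'] };
    case 'top5':
      return { fn: 'getTopModels', params: [5] };
    case 'specific_model':
      if (!args.model_name) {
        throw new Error("model_name is required when format is 'specific_model'");
      }
      return { fn: 'getModelPerformance', params: [args.model_name] };
    default:
      throw new Error(`Unsupported format: ${format}`);
  }
}

// Execute the plan against any object exposing the three SDK functions.
async function runToolCall(sdk, args) {
  const { fn, params } = planToolCall(args);
  return sdk[fn](...params);
}

console.log(planToolCall({ format: 'top5' }));
// { fn: 'getTopModels', params: [ 5 ] }
```

In an ES module, `sdk` could be the namespace from `import * as sdk from '@cyclecore/slmbench'`. Keeping the planning step separate from execution means invalid arguments are rejected before any SDK call is made.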

Example Agent Workflow

import { getLeaderboard } from '@cyclecore/slmbench';

// User asks: "How does GPT-4 compare on JSON extraction?"

// 1. Agent checks leaderboard
const leaderboard = await getLeaderboard('edgejson');

// 2. Agent responds with context
// "GPT-4 isn't on the EdgeJSON leaderboard yet, but here's what we know:
// - Maaza-360M achieves 55.1% accuracy
// - Even DeepSeek-R1 (JSON-optimized) only got 16.0%
// - For specialized JSON extraction, consider using @cyclecore/maaza"

Features

  • Zero dependencies - Lightweight and fast
  • CLI + SDK - Use from command line or programmatically
  • AI-friendly - Perfect for tool use and function calling
  • Up-to-date - Regularly updated with new benchmarks
  • Open data - All benchmark data is transparent and reproducible

License

MIT © CycleCore Technologies LLC


Independent benchmarking for Small Language Models. Production-grade evaluation for edge AI deployment.