@ongravy/agent-kit

v0.1.0

Production-grade primitives for building agentic AI systems on the Anthropic SDK: hybrid RAG with RRF, eval metrics (recall@k, NDCG, tool-call F1), cost-aware model router, and prompt-cache planning. Extracted from OnGravy.

Downloads

116

Production-grade primitives for building agentic AI systems on the Anthropic SDK. Extracted from OnGravy — an AI-native accounting platform shipping June 2026.

What this is

A small, production-tested library of the boring-but-critical pieces every agentic LLM system needs:

  • 🔀 Reciprocal Rank Fusion — for hybrid retrieval (combine vector + BM25 + any other ranking)
  • 📊 Eval metrics — recall@k, NDCG, MRR, tool-call F1; the metrics you wish you'd built before that 1am production fire
  • 💰 Cost-aware model router — Haiku → Sonnet → Opus by complexity, with budget downgrade
  • 💾 Prompt-cache planner — decides which blocks should carry cache_control for Anthropic's 90%-savings caching
  • 🔌 MCP wire-format converter — turns your Zod-shaped tool registry into Model Context Protocol tool spec, ready for Claude Desktop / Cursor / Zed

Each module is pure (no I/O, no network) and has zero runtime dependencies; zod is the only peer dependency. The bundle adds ~12 KB gzipped to your build.

Why these specific helpers

These are the things you think are simple in your demo project, then realise are subtle when you ship:

| Helper | What goes wrong without it |
|---|---|
| reciprocalRankFusion | Vector retrieval misses queries with rare proper nouns; BM25 misses queries with paraphrases. Single-stage → ~20% recall loss vs hybrid. |
| routeModel | Naive setups always use Sonnet → 5× higher LLM bill. Naive routing puts everything on Haiku → reasoning fails on hard queries. |
| planPromptCache | Anthropic's prompt cache only applies to prefixes of at least 1,024 tokens (more on some models). Marking shorter blocks with cache_control adds complexity without any saving. |
| evalRetrievalSet | "Does this prompt change improve quality?" Without recall@k metrics, your only answer is "vibes." |
| toMCPTool + sanitiseMcpName | MCP names can't contain dots/colons/spaces. Naive registries break on tax.compute_gst. |

Installation

npm install @ongravy/agent-kit
# or
pnpm add @ongravy/agent-kit
# or
bun add @ongravy/agent-kit

Peer dep:

npm install zod  # ^4.0.0

Quick examples

Hybrid retrieval (vector + BM25)

import { reciprocalRankFusion } from '@ongravy/agent-kit/rrf';

// Fetch ranked results from each source however you like;
// pgvector and pgTsvector below stand in for your own search clients
const vectorResults = await pgvector.search(queryEmbedding, { k: 30 });
const bm25Results   = await pgTsvector.search(queryText,    { k: 30 });

// Fuse them
const fused = reciprocalRankFusion({
  rankings: [
    { label: 'vec',  items: vectorResults.map(r => ({ id: r.id, score: r.cosine })) },
    { label: 'bm25', items: bm25Results.map(r => ({ id: r.id, score: r.tsRank })) },
  ],
  // Optional: weight one source higher when its precision is better
  weights: { vec: 1.0, bm25: 0.7 },
});

// Top 5 fused results
const topFive = fused.slice(0, 5);
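
For readers who want to see what the fusion step does, here is a minimal sketch of weighted Reciprocal Rank Fusion. It is not the package's source: the function name rrfSketch, the input shape, and the smoothing constant k = 60 (a commonly used default) are assumptions made for illustration. Note that RRF looks only at rank positions, so the raw scores from each source never need to be comparable.

```typescript
// Minimal sketch of weighted Reciprocal Rank Fusion (illustrative, not the package's code).
// Assumption: k = 60, a commonly used smoothing constant for RRF.
type SketchRanking = { label: string; ids: string[] }; // ids are ordered best-first

function rrfSketch(
  rankings: SketchRanking[],
  weights: Record<string, number> = {},
  k = 60,
): { id: string; score: number }[] {
  const scores = new Map<string, number>();
  for (const { label, ids } of rankings) {
    const weight = weights[label] ?? 1; // unweighted sources default to 1
    ids.forEach((id, index) => {
      // Ranks are 1-based; each list contributes weight / (k + rank).
      const contribution = weight / (k + index + 1);
      scores.set(id, (scores.get(id) ?? 0) + contribution);
    });
  }
  // Highest fused score first.
  return Array.from(scores, ([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

const fusedSketch = rrfSketch([
  { label: 'vec',  ids: ['a', 'b', 'c'] },
  { label: 'bm25', ids: ['b', 'd', 'a'] },
]);
// Fused order: b, a, d, c ('b' wins because it is near the top of both lists)
```

Documents that appear in several lists accumulate contributions, which is why 'b' and 'a' outrank the single-list hits 'c' and 'd'.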

Cost-aware model routing

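import Anthropic from '@anthropic-ai/sdk';
import { routeModel } from '@ongravy/agent-kit/router';

// Reads ANTHROPIC_API_KEY from the environment by default
const anthropic = new Anthropic();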

const decision = routeModel({
  inputTokens:           4_000,
  toolCount:             5,
  expectedTurns:         3,
  isHighStakes:          true,        // money/compliance answer
  remainingBudgetPaise:  1_50_000,    // ₹1500 left this month
});

// decision.model:                   'claude-sonnet-4-6'
// decision.reason:                  'Multi-turn / multi-tool / high-stakes'
// decision.budgetDowngraded:        false
// decision.estimatedInputCostPaise: 1200

const response = await anthropic.messages.create({
  model: decision.model,
  // …
});
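
The routing idea above can be sketched in two steps: pick a tier from workload complexity, then step down while the estimated cost exceeds the remaining budget. Everything in this block is hypothetical: the routeSketch name, the thresholds, and the per-1,000-token prices are made-up values for illustration and are not the package's actual rules or Anthropic's pricing.

```typescript
// Hypothetical sketch of complexity-based routing with a budget downgrade.
// Thresholds and prices are illustrative assumptions only.
type Tier = 'haiku' | 'sonnet' | 'opus';

interface RouteContextSketch {
  inputTokens: number;
  toolCount: number;
  expectedTurns: number;
  isHighStakes: boolean;
  remainingBudgetPaise: number;
}

// Assumed input price per 1,000 tokens, in paise (made-up numbers).
const ASSUMED_PAISE_PER_1K: Record<Tier, number> = { haiku: 10, sonnet: 30, opus: 150 };
// Next cheaper tier, or null when there is nothing cheaper.
const CHEAPER: Record<Tier, Tier | null> = { opus: 'sonnet', sonnet: 'haiku', haiku: null };

function routeSketch(ctx: RouteContextSketch): { tier: Tier; budgetDowngraded: boolean } {
  // Step 1: choose a tier from workload complexity alone.
  let tier: Tier = 'haiku';
  if (ctx.isHighStakes || ctx.toolCount > 2 || ctx.expectedTurns > 1) tier = 'sonnet';
  if (ctx.isHighStakes && ctx.inputTokens > 50_000) tier = 'opus';

  // Step 2: step down while the estimated input cost exceeds the budget.
  const estimateCost = (t: Tier): number => (ctx.inputTokens / 1000) * ASSUMED_PAISE_PER_1K[t];
  let budgetDowngraded = false;
  let cheaper = CHEAPER[tier];
  while (cheaper !== null && estimateCost(tier) > ctx.remainingBudgetPaise) {
    tier = cheaper;
    cheaper = CHEAPER[tier];
    budgetDowngraded = true;
  }
  return { tier, budgetDowngraded };
}

const baseContext = { inputTokens: 4_000, toolCount: 5, expectedTurns: 3, isHighStakes: true };
const withinBudget = routeSketch({ ...baseContext, remainingBudgetPaise: 150_000 });
// withinBudget: { tier: 'sonnet', budgetDowngraded: false }
const overBudget = routeSketch({ ...baseContext, remainingBudgetPaise: 50 });
// overBudget: { tier: 'haiku', budgetDowngraded: true }
```

Keeping the complexity decision separate from the budget check makes it easy to log why a request was downgraded, which is what a field like budgetDowngraded is for.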

Prompt caching

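import Anthropic from '@anthropic-ai/sdk';
import { planPromptCache } from '@ongravy/agent-kit/router';

// Reads ANTHROPIC_API_KEY from the environment by default.
// SYSTEM_PROMPT and tools below are your own prompt string and tool definitions.
const anthropic = new Anthropic();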

const plan = planPromptCache({
  systemTokens:           2_500,
  toolsTokens:            4_000,
  expectStableForFiveMin: true,
});

// plan.cacheSystem: true
// plan.cacheTools:  true
// plan.estimatedSavingsPctOnNextCall: 81

// Then on the API call:
const response = await anthropic.messages.create({
  model: 'claude-sonnet-4-6',
  system: [
    { type: 'text', text: SYSTEM_PROMPT,
      ...(plan.cacheSystem && { cache_control: { type: 'ephemeral' } }) },
  ],
  tools: tools.map((t, i) => ({
    ...t,
    ...(plan.cacheTools && i === tools.length - 1 && { cache_control: { type: 'ephemeral' } }),
  })),
  // …
});
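
A planner of this kind boils down to a threshold check. The sketch below is a hypothetical simplification, not the package's logic: the planCacheSketch name and the 1,024-token minimum are assumptions (the minimum is the figure quoted earlier in this README and can be higher for some models). It also assumes the cached prefix is built in the order tools, then system, so a breakpoint on the system block covers the tool definitions as well.

```typescript
// Hypothetical sketch: place a cache breakpoint only where the cumulative
// prefix clears the minimum cacheable size and the content is expected to be reused.
// Assumption: a 1,024-token minimum; check the figure for the model you use.
const ASSUMED_MIN_CACHEABLE_TOKENS = 1024;

interface CachePlanInputSketch {
  systemTokens: number;
  toolsTokens: number;
  expectStableForFiveMin: boolean;
}

function planCacheSketch(input: CachePlanInputSketch): { cacheSystem: boolean; cacheTools: boolean } {
  // Content that changes between calls never produces a cache hit.
  if (!input.expectStableForFiveMin) return { cacheSystem: false, cacheTools: false };

  // Assumed prefix order: tools first, then system.
  const toolsPrefixTokens = input.toolsTokens;
  const systemPrefixTokens = input.toolsTokens + input.systemTokens;
  return {
    cacheTools: toolsPrefixTokens >= ASSUMED_MIN_CACHEABLE_TOKENS,
    cacheSystem: systemPrefixTokens >= ASSUMED_MIN_CACHEABLE_TOKENS,
  };
}

const largePrompt = planCacheSketch({ systemTokens: 2_500, toolsTokens: 4_000, expectStableForFiveMin: true });
// largePrompt: { cacheSystem: true, cacheTools: true }
const smallTools = planCacheSketch({ systemTokens: 800, toolsTokens: 500, expectStableForFiveMin: true });
// smallTools: { cacheSystem: true, cacheTools: false }  (only the combined prefix is long enough)
```

The five-minute stability flag matters because the default ephemeral cache entry expires after about five minutes without a hit.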

Retrieval evaluation

import { evalRetrievalSet } from '@ongravy/agent-kit/eval';

const summary = evalRetrievalSet([
  {
    queryId:      'q1',
    query:        'GST rate on legal services',
    relevantIds:  [{ id: 'doc-A', grade: 3 }, { id: 'doc-B', grade: 1 }],
    retrievedIds: ['doc-A', 'doc-X', 'doc-B', 'doc-Y'],
  },
  // … 30 more cases
]);

console.log(`recall@5 = ${summary.meanRecallAtK[5].toFixed(3)}`);
console.log(`NDCG@5   = ${summary.meanNdcgAtK[5].toFixed(3)}`);
console.log(`MRR      = ${summary.meanMrr.toFixed(3)}`);
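
To make the three retrieval metrics concrete, here is a small sketch of how each can be computed for a single case. It is an illustration under stated assumptions rather than the package's implementation: the function names are hypothetical, and NDCG here uses linear gain (the grade itself) with a log2 position discount; some implementations use 2^grade - 1 instead, which gives different numbers.

```typescript
// Illustrative per-case retrieval metrics (not the package's code).
type GradedDoc = { id: string; grade: number };

// Fraction of relevant documents that appear in the top k results.
function recallAtK(relevant: GradedDoc[], retrieved: string[], k: number): number {
  if (relevant.length === 0) return 0;
  const topK = new Set(retrieved.slice(0, k));
  return relevant.filter(doc => topK.has(doc.id)).length / relevant.length;
}

// Discounted cumulative gain: gain at 0-based position i is divided by log2(i + 2).
function dcg(gains: number[]): number {
  return gains.reduce((sum, gain, i) => sum + gain / Math.log2(i + 2), 0);
}

// DCG of the actual ranking divided by the DCG of the ideal ranking.
// Assumption: linear gain, i.e. the grade is used directly.
function ndcgAtK(relevant: GradedDoc[], retrieved: string[], k: number): number {
  const gradeById = new Map<string, number>(
    relevant.map((doc): [string, number] => [doc.id, doc.grade]),
  );
  const actualGains = retrieved.slice(0, k).map(id => gradeById.get(id) ?? 0);
  const idealGains = relevant.map(doc => doc.grade).sort((a, b) => b - a).slice(0, k);
  const idealDcg = dcg(idealGains);
  return idealDcg === 0 ? 0 : dcg(actualGains) / idealDcg;
}

// 1 / rank of the first relevant result, or 0 if none was retrieved.
function reciprocalRank(relevant: GradedDoc[], retrieved: string[]): number {
  const relevantIds = new Set(relevant.map(doc => doc.id));
  const index = retrieved.findIndex(id => relevantIds.has(id));
  return index === -1 ? 0 : 1 / (index + 1);
}

const exampleRelevant: GradedDoc[] = [{ id: 'doc-A', grade: 3 }, { id: 'doc-B', grade: 1 }];
const exampleRetrieved = ['doc-A', 'doc-X', 'doc-B', 'doc-Y'];
const recallAt2 = recallAtK(exampleRelevant, exampleRetrieved, 2); // 0.5 (doc-B is at rank 3)
const recallAt5 = recallAtK(exampleRelevant, exampleRetrieved, 5); // 1
const rr = reciprocalRank(exampleRelevant, exampleRetrieved);      // 1 (first hit at rank 1)
const ndcgAt5 = ndcgAtK(exampleRelevant, exampleRetrieved, 5);     // a little below 1
```

MRR is the mean of reciprocalRank across all cases; the aggregate recall and NDCG figures are likewise per-case values averaged over the set.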

Tool-call accuracy

import { evalToolCallSet } from '@ongravy/agent-kit/eval';

const summary = evalToolCallSet([
  {
    caseId:        'tc1',
    expectedTool:  'create_invoice',
    expectedArgs:  { amount: 1000, party: 'Acme' },
    actualTool:    'create_invoice',
    actualArgs:    { amount: 1000, party: 'Acme', date: '2026-04-01' },
  },
  // …
]);

console.log(`tool accuracy:        ${(summary.toolAccuracy*100).toFixed(1)}%`);
console.log(`fully correct:        ${(summary.fullyCorrectAccuracy*100).toFixed(1)}%`);
console.log(`mean arg-match score: ${(summary.meanArgMatchScore*100).toFixed(1)}%`);
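
One plausible way to score argument matching is the fraction of expected arguments that the model reproduced exactly. The sketch below is hypothetical: argMatchScoreSketch is not a package export, and the choice to ignore extra arguments (such as date in the case above) is an assumption; the package may treat extras differently.

```typescript
// Hypothetical arg-match score: share of expected arguments reproduced exactly.
// Assumption: extra arguments in the actual call are ignored, not penalised.
type ToolArgs = Record<string, unknown>;

function argMatchScoreSketch(expected: ToolArgs, actual: ToolArgs): number {
  const expectedKeys = Object.keys(expected);
  if (expectedKeys.length === 0) return 1; // nothing was required
  const matched = expectedKeys.filter(
    // JSON comparison handles primitives and simple nested values;
    // note that it is sensitive to key order inside nested objects.
    key => JSON.stringify(actual[key]) === JSON.stringify(expected[key]),
  ).length;
  return matched / expectedKeys.length;
}

const exactWithExtra = argMatchScoreSketch(
  { amount: 1000, party: 'Acme' },
  { amount: 1000, party: 'Acme', date: '2026-04-01' },
); // 1 (both expected arguments match; the extra date is ignored)

const oneWrong = argMatchScoreSketch(
  { amount: 1000, party: 'Acme' },
  { amount: 999, party: 'Acme' },
); // 0.5 (amount differs)
```

Whether an unexpected extra argument should count against a "fully correct" call is a policy decision worth making explicit in your eval set.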

MCP tool format conversion

import { z } from 'zod';
import { buildMCPCatalog, type GenericToolDef } from '@ongravy/agent-kit/mcp';

// Your existing tool registry — any shape with name, description, Zod schema
const myTools: GenericToolDef[] = [
  {
    name:          'tax.compute_gst',
    description:   'Compute GST on a sale.',
    inputSchema:   z.object({ amount: z.number(), rate: z.number() }),
    jurisdictions: ['IN'],
  },
  {
    name:          'sa_zatca.compile',
    description:   'Compile a ZATCA VAT return.',
    inputSchema:   z.object({ taxPeriod: z.string() }),
    jurisdictions: ['SA'],
  },
];

// Filter by jurisdiction + convert to MCP wire format
const catalog = buildMCPCatalog(myTools, { jurisdiction: 'IN' });
// catalog.tools[0].name === 'tax_compute_gst'   (sanitised — dots replaced)
// catalog.tools[0].inputSchema is JSON Schema
// catalog only contains the IN tool (SA filtered out)
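
The name sanitisation shown in the comments above can be approximated with a single regular expression. This is a sketch under an assumption: that the safe character set is ASCII letters, digits, underscores and hyphens. The sanitiseNameSketch function is hypothetical and may not match what sanitiseMcpName does in edge cases.

```typescript
// Hypothetical sketch of tool-name sanitisation.
// Assumption: only [A-Za-z0-9_-] is safe; every other character becomes '_'.
function sanitiseNameSketch(name: string): string {
  return name.replace(/[^A-Za-z0-9_-]/g, '_');
}

const gstName = sanitiseNameSketch('tax.compute_gst');    // 'tax_compute_gst'
const zatcaName = sanitiseNameSketch('sa_zatca.compile'); // 'sa_zatca_compile'
const mixedName = sanitiseNameSketch('billing:send invoice'); // 'billing_send_invoice'
```

A replacement like this can map two distinct names to the same result (for example a.b and a_b), so a real registry would want to check for collisions after sanitising.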

Why this works

These helpers come from a real production system that:

  • Handles real money (Indian SMB accounting, GST compliance, audit reports)
  • Runs at multi-tenant scale across 4 jurisdictions (India, UAE, Saudi Arabia, Singapore)
  • Has a measured 0.3% hallucination rate post-defences (vs ~12% bare model)
  • Has a per-business cost cap of ₹500/month enforced via the model router

If you're building anything in the same ballpark — domain-specific Q&A, agentic workflow automation, multi-tenant LLM systems — these primitives have already paid for themselves once.

Read the longform writeup: Six-layer hallucination defence.

API reference

rrf module

| Export | Purpose |
|---|---|
| reciprocalRankFusion(input) | Fuse N ranked lists into one ranking |
| RankedItem, RrfInput, FusedItem | Type signatures |

eval module

| Export | Purpose |
|---|---|
| evalRetrievalCase(case, ks?) | Single-case recall@k / NDCG / MRR |
| evalRetrievalSet(cases, ks?) | Aggregate across cases |
| evalToolCallCase(case) | Per-case tool-correct + arg-match |
| evalToolCallSet(cases) | Aggregate tool-call summary |

router module

| Export | Purpose |
|---|---|
| routeModel(ctx) | Pick Haiku / Sonnet / Opus per workload + budget |
| planPromptCache(input) | Plan cache_control placement for stable prefixes |
| AnthropicModel, RoutingDecision, CachePlan | Type signatures |

mcp module

| Export | Purpose |
|---|---|
| toMCPTool(def) | Convert internal tool def → MCP wire format |
| buildMCPCatalog(defs, opts?) | Build full tools/list response |
| filterToolsForMCP(defs, opts?) | Filter by jurisdiction |
| sanitiseMcpName(name) | Make any name MCP-safe |

Versioning

This package follows semver. Pre-1.0 releases (0.x.y) may include breaking changes in any release; from 1.0.0 onward, breaking changes will bump the major version.

Contributing

Issues + PRs welcome at github.com/pratikrevankar/ongravy. The pure-test discipline of the parent repo applies — every helper has a corresponding test file. See tests/lib-pure/ for the test pattern.

Author

Pratik Revankar — builder of OnGravy. @pratikrevankar on X.

License

MIT — see LICENSE.