@chatman-media/kb

v1.14.1

Tenant-scoped Knowledge Base: hybrid retrieval (pgvector + BM25), ingest, answer pipeline, persona/skill composition. LLM I/O lives in @chatman-media/llm-router.

Downloads

2,126

Tenant-scoped Knowledge Base for the Lead Engine platform. Provides a full retrieval-augmented generation (RAG) pipeline: hybrid search (pgvector cosine + BM25 with RRF fusion), multi-query expansion, cross-encoder reranking, MMR diversification, dynamic distance threshold, semantic caching, persona/skill composition, photo classification and passport OCR.

LLM I/O lives in @chatman-media/llm-router.

Key modules

| Module | What it does |
|---|---|
| answer.ts | Full RAG answer pipeline: retrieve → filter → diversify → rerank → generate |
| ingest.ts | Document ingest: parse → chunk → embed → upsert |
| hybrid-search.ts | pgvector cosine + BM25 keyword fusion via RRF |
| retrieval-utils.ts | rrfMerge, applyDynamicThreshold, mmrDiversify: post-retrieval transforms |
| multi-query.ts | expandQueries: LLM-generated query variants for parallel search |
| reranker.ts | JinaReranker, CohereReranker: cross-encoder second-pass reranking |
| rewrite-query.ts | Context-aware query rewriting (resolves pronouns / ellipsis via history) |
| semantic-cache.ts | Vector-similarity cache for identical/near-identical questions |
| vision.ts | classifyPhoto() + extractPassportIdentity(): passport OCR via vision LLM |
| ab-router.ts | A/B experiment allocation for styles/personas |
| grade-skills.ts | ELO-based skill grading via judge LLM |
| prompt.ts | composeSystemPrompt(): assemble sales persona + KB context + style |

RAG pipeline

Each call to answerWithRag / answerWithRagStream goes through these stages:

1. [opt] Query rewrite     — resolves pronouns, expands ellipsis (rewriteQueryBeforeRetrieval)
2. [opt] Multi-query       — generate N variants → embed all in parallel (multiQuery)
3. Vector / hybrid search  — pgvector cosine or RRF(vector+BM25)
4. [opt] RRF merge         — fuse N result lists if multi-query was used (rrfMerge)
5. [opt] Distance filter   — drop hits > threshold (autoTrimDistance)
6. [opt] MMR diversify     — reduce duplicate chunks (mmr)
7. [opt] Cross-encoder     — reranker.rerank(query, candidates, topK) (reranker)
8. Prompt composition      — style + persona + context + skills + hooks
9. LLM generation          — stream or single response
10. [opt] Fact-checker     — hallucination guard (reflect)

Stages marked [opt] are optional and controlled per-request via AnswerInput fields; search, prompt composition and LLM generation always run.
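As a rough illustration of per-request gating, the sketch below maps the flags named in the stage list to the stages that would run. The `StageFlags` type, the `plannedStages` function and the stage labels are hypothetical and are not part of the package's API; only the flag names come from the list above.

```typescript
// Hypothetical subset of AnswerInput: only the flags named in the stage list.
interface StageFlags {
  rewriteQueryBeforeRetrieval?: boolean;
  multiQuery?: boolean;
  hybridSearch?: boolean;
  autoTrimDistance?: boolean;
  mmr?: boolean;
  reranker?: unknown;
  reflect?: boolean;
}

// Returns the ordered stage labels that would run for a given set of flags.
// Search, prompt composition and generation are always included.
function plannedStages(flags: StageFlags): string[] {
  const stages: string[] = [];
  if (flags.rewriteQueryBeforeRetrieval) stages.push("query-rewrite");
  if (flags.multiQuery) stages.push("multi-query");
  stages.push(flags.hybridSearch ? "hybrid-search" : "vector-search");
  if (flags.multiQuery) stages.push("rrf-merge");
  if (flags.autoTrimDistance) stages.push("distance-filter");
  if (flags.mmr) stages.push("mmr");
  if (flags.reranker) stages.push("rerank");
  stages.push("prompt-composition", "llm-generation");
  if (flags.reflect) stages.push("fact-check");
  return stages;
}

console.log(plannedStages({ hybridSearch: true, mmr: true }));
// → [ 'hybrid-search', 'mmr', 'prompt-composition', 'llm-generation' ]
```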

API

answerWithRag(input: AnswerInput): Promise<AnswerResult>

import { answerWithRag, JinaReranker } from "@chatman-media/kb";

const result = await answerWithRag({
  question: "сколько стоит квартира в ЖК Марина?", // "How much does an apartment in the Marina residential complex cost?"
  kb,           // IKbStore implementation
  embedder,     // EmbeddingClient
  chat,         // ChatClient

  // ── Retrieval tuning ──────────────────────────────
  topK: 5,                        // final chunks to pass to the LLM
  hybridSearch: true,             // vector + BM25 fusion
  rewriteQueryBeforeRetrieval: true, // resolve "там" ("there") / "он" ("he"/"it") via history

  // Multi-query expansion (generate 2 rephrases, search 3 in parallel)
  multiQuery: true,
  multiQueryCount: 2,

  // Drop chunks with cosine distance > 0.45 (reduce hallucinations)
  autoTrimDistance: true,
  autoTrimThreshold: 0.45,

  // Maximal Marginal Relevance — diversify results
  mmr: true,
  mmrLambda: 0.6,               // 1.0 = pure relevance, 0.0 = pure diversity

  // Cross-encoder reranker (retrieves topK×3 candidates, returns topK)
  reranker: new JinaReranker({ apiKey: process.env.JINA_API_KEY! }),

  // ── Generation ────────────────────────────────────
  history,        // ChatMessage[] — conversation context
  persona,        // Persona — bot identity
  style,          // Style — sales methodology (SPIN / NEPQ / AIDA)
  stage,          // FunnelStage — current funnel stage
  skills,         // SkillForPrompt[] — active persuasion skills
  reflect: true,  // hallucination guard (LLM judge)

  onTelemetry: (t) => console.log(t), // retrieval_ms, top_distances, path, ...
});

console.log(result.text);         // generated reply
console.log(result.hits);         // KbSearchHit[] — chunks used
console.log(result.usedChunkIds); // chunk IDs referenced in the reply
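With multi-query enabled, the original question and its rephrases each produce a ranked list, and those lists are fused by Reciprocal Rank Fusion. The sketch below shows the general technique, not the package's `rrfMerge` source: the `Hit` shape and the `rrfFuse` name are hypothetical, and `k = 60` is a commonly used constant assumed here.

```typescript
// Hypothetical hit shape; the package's KbSearchHit may carry more fields.
interface Hit {
  id: string;
  distance: number;
}

// Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank),
// where rank starts at 1 within each list.
function rrfFuse(lists: Hit[][], k = 60, topN = 15): Hit[] {
  const scores = new Map<string, { hit: Hit; score: number }>();
  for (const list of lists) {
    list.forEach((hit, index) => {
      // Keep the first-seen copy of each hit and accumulate its score.
      const entry = scores.get(hit.id) ?? { hit, score: 0 };
      entry.score += 1 / (k + index + 1);
      scores.set(hit.id, entry);
    });
  }
  return Array.from(scores.values())
    .sort((a, b) => b.score - a.score)
    .slice(0, topN)
    .map((entry) => entry.hit);
}

// Made-up result lists for three query variants.
const fused = rrfFuse([
  [{ id: "a", distance: 0.2 }, { id: "b", distance: 0.3 }],
  [{ id: "b", distance: 0.25 }, { id: "c", distance: 0.4 }],
  [{ id: "b", distance: 0.22 }, { id: "a", distance: 0.35 }],
]);
console.log(fused.map((h) => h.id)); // → [ 'b', 'a', 'c' ]
```

Hit "b" wins because it appears in all three lists, even though "a" has the single best distance. Rewarding agreement across queries over any one raw score is the point of rank fusion.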

Rerankers

import { JinaReranker, CohereReranker } from "@chatman-media/kb";

// Jina: multilingual, good for Russian (jina-reranker-v2-base-multilingual)
const jinaReranker = new JinaReranker({
  apiKey: process.env.JINA_API_KEY!,
  model: "jina-reranker-v2-base-multilingual", // default
});

// Cohere: also multilingual (rerank-v3.5)
const cohereReranker = new CohereReranker({
  apiKey: process.env.COHERE_API_KEY!,
  model: "rerank-v3.5", // default
});
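The answerWithRag example notes that the reranker receives topK×3 candidates and returns topK. The sketch below illustrates that over-fetch-then-rerank pattern without calling any external API. The `Candidate` type, `overlapScore` and `rerankTopK` are hypothetical, and the word-overlap scorer is only a toy stand-in for a real cross-encoder.

```typescript
interface Candidate {
  id: string;
  text: string;
}

// Toy stand-in for a cross-encoder: fraction of query words found in the text.
function overlapScore(query: string, text: string): number {
  const words = query.toLowerCase().split(/\s+/).filter(Boolean);
  if (words.length === 0) return 0;
  const haystack = text.toLowerCase();
  return words.filter((w) => haystack.includes(w)).length / words.length;
}

// Second pass: score every first-pass candidate against the query, keep the best topK.
function rerankTopK(
  query: string,
  candidates: Candidate[],
  topK: number,
  score: (query: string, text: string) => number = overlapScore,
): Candidate[] {
  return candidates
    .map((candidate) => ({ candidate, value: score(query, candidate.text) }))
    .sort((a, b) => b.value - a.value)
    .slice(0, topK)
    .map((scored) => scored.candidate);
}

const topK = 1;
const poolSize = topK * 3; // over-fetch factor taken from the example's comment
// Made-up first-pass results, capped at the over-fetched pool size.
const pool: Candidate[] = [
  { id: "1", text: "Parking rules for residents" },
  { id: "2", text: "Apartment price list for the Marina complex" },
  { id: "3", text: "Marina opening hours" },
].slice(0, poolSize);

console.log(rerankTopK("apartment price Marina", pool, topK).map((c) => c.id)); // → [ '2' ]
```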

Post-retrieval utilities

import { rrfMerge, applyDynamicThreshold, mmrDiversify } from "@chatman-media/kb";

// Merge results from multiple queries via Reciprocal Rank Fusion
const merged = rrfMerge([hitsFromQuery1, hitsFromQuery2, hitsFromQuery3], { topN: 15 });

// Drop hits with cosine distance > 0.4 (keep at least 1)
const trimmed = applyDynamicThreshold(hits, { threshold: 0.4, minHits: 1 });

// Maximal Marginal Relevance — diversify, reduce duplicates
const diverse = mmrDiversify(hits, { lambda: 0.6, topK: 5 });
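To make the last two transforms concrete, here is a self-contained sketch of distance trimming with a minimum-hits floor, followed by Maximal Marginal Relevance. It is an illustration of the techniques, not the package's implementation. The `Hit` shape (including an `embedding` field), `trimByDistance` and `mmrSelect` are hypothetical, and relevance is assumed to be `1 - distance`.

```typescript
// Hypothetical hit shape: cosine distance to the query plus the chunk embedding.
interface Hit {
  id: string;
  distance: number;
  embedding: number[];
}

// Keep hits at or below the threshold, but never return fewer than minHits.
function trimByDistance(hits: Hit[], threshold: number, minHits = 1): Hit[] {
  const sorted = hits.slice().sort((a, b) => a.distance - b.distance);
  const kept = sorted.filter((h) => h.distance <= threshold);
  return kept.length >= minHits ? kept : sorted.slice(0, minHits);
}

function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// MMR: repeatedly pick the hit maximizing
// lambda * relevance - (1 - lambda) * (max similarity to already selected hits).
function mmrSelect(hits: Hit[], lambda: number, topK: number): Hit[] {
  const remaining = hits.slice();
  const selected: Hit[] = [];
  while (selected.length < topK && remaining.length > 0) {
    let bestIndex = 0;
    let bestScore = -Infinity;
    remaining.forEach((candidate, i) => {
      const relevance = 1 - candidate.distance;
      const redundancy =
        selected.length === 0
          ? 0
          : Math.max(...selected.map((s) => cosine(candidate.embedding, s.embedding)));
      const score = lambda * relevance - (1 - lambda) * redundancy;
      if (score > bestScore) {
        bestScore = score;
        bestIndex = i;
      }
    });
    selected.push(remaining.splice(bestIndex, 1)[0]);
  }
  return selected;
}

// Made-up hits: "b" is a near-duplicate of "a", "d" is too far from the query.
const sample: Hit[] = [
  { id: "a", distance: 0.1, embedding: [1, 0] },
  { id: "b", distance: 0.12, embedding: [1, 0.1] },
  { id: "c", distance: 0.3, embedding: [0, 1] },
  { id: "d", distance: 0.7, embedding: [0.5, 0.5] },
];
const kept = trimByDistance(sample, 0.45);
const picked = mmrSelect(kept, 0.6, 2);
console.log(kept.map((h) => h.id));   // → [ 'a', 'b', 'c' ]
console.log(picked.map((h) => h.id)); // → [ 'a', 'c' ]
```

With lambda at 0.6 the near-duplicate "b" loses to the less relevant but distinct "c". Raising lambda toward 1.0 would bring "b" back.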

Golden RAG evals

Use evaluateRagGoldenCases when you need an end-to-end regression harness around retrieval, answer grounding, persona/stage expectations and ablations. It runs the normal answerWithRag pipeline and scores each JSONL case against expected sources, required facts, forbidden claims and telemetry path. By default the scorer is deterministic (substring/source matching), so it can run in CI without live LLM calls. For semantic groundedness checks, pass judge: makeRagGoldenLlmJudge({ chat: judgeChat }).

import {
  defaultRagGoldenAblations,
  evaluateRagGoldenCases,
  formatRagGoldenFailures,
  parseRagGoldenJsonl,
} from "@chatman-media/kb";

const cases = parseRagGoldenJsonl(await Bun.file("packages/kb/evals/generic-rag.jsonl").text());

const report = await evaluateRagGoldenCases({
  cases,
  makeInput: (item) => ({
    question: item.question,
    kb,
    embedder,
    chat,
    hybridSearch: true,
    multiQuery: true,
    mmr: true,
  }),
  ablations: defaultRagGoldenAblations(),
});

const failures = formatRagGoldenFailures(report);
if (failures) throw new Error(failures);
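The deterministic default described above relies on substring and source matching. A minimal sketch of that idea follows. The `GoldenCase` and `Answer` shapes, their field names and the `scoreCase` function are assumptions for illustration and do not reflect the package's actual JSONL schema or scorer.

```typescript
// Hypothetical case shape; field names are illustrative only.
interface GoldenCase {
  question: string;
  expectedSources: string[];
  requiredFacts: string[];
  forbiddenClaims: string[];
}

// Hypothetical answer shape: generated text plus the ids of the sources used.
interface Answer {
  text: string;
  sourceIds: string[];
}

// Returns a list of failure messages; an empty list means the case passed.
function scoreCase(item: GoldenCase, answer: Answer): string[] {
  const failures: string[] = [];
  const text = answer.text.toLowerCase();
  for (const source of item.expectedSources) {
    if (!answer.sourceIds.includes(source)) failures.push(`missing source: ${source}`);
  }
  for (const fact of item.requiredFacts) {
    if (!text.includes(fact.toLowerCase())) failures.push(`missing fact: ${fact}`);
  }
  for (const claim of item.forbiddenClaims) {
    if (text.includes(claim.toLowerCase())) failures.push(`forbidden claim: ${claim}`);
  }
  return failures;
}

// Made-up case and answers.
const refundCase: GoldenCase = {
  question: "What is the refund window?",
  expectedSources: ["doc-1"],
  requiredFacts: ["30 days"],
  forbiddenClaims: ["lifetime warranty"],
};

console.log(scoreCase(refundCase, { text: "Refunds are accepted within 30 days.", sourceIds: ["doc-1"] }));
// → []
console.log(scoreCase(refundCase, { text: "We offer a lifetime warranty.", sourceIds: [] }));
// → [ 'missing source: doc-1', 'missing fact: 30 days', 'forbidden claim: lifetime warranty' ]
```

Because nothing here calls a model, the same inputs always give the same verdict, which is what makes this style of check usable in CI.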

For a deterministic local/CI smoke run without external LLM or DB calls:

bun run apps/api/scripts/eval-rag.ts --ablate
bun run apps/api/scripts/eval-rag.ts --vertical=exchange --baseline=rag-baseline.json
bun run apps/api/scripts/eval-rag.ts --update-baseline=rag-baseline.json --json

Vision

import { classifyPhoto, extractPassportIdentity } from "@chatman-media/kb";

// `res` is assumed to be a fetch() Response carrying the photo
const bytes = await res.arrayBuffer();
const model = "gpt-4o";
const apiKey = "sk-...";
const provider = "openai";

const cls = await classifyPhoto({ bytes, model, apiKey, provider });
// cls → "passport" | "full_body" | "portrait" | "other"

if (cls === "passport") {
  const identity = await extractPassportIdentity({ bytes, model, apiKey, provider });
  // identity → { family_name?, given_name?, passport_number?, passport_expiry? }
}

In apps/api, photo classification is wired automatically via photo-processor.ts: when a tenant has a vision LLM config, every incoming photo is classified and passport data is merged into contact.attributes_json.
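How the merge into contact.attributes_json works is not shown here, so the following is only a sketch. It assumes attributes_json is stored as a JSON string and that the identity fields are strings. `PassportIdentity` mirrors the field names from the comment above, and `mergeIdentity` is a hypothetical helper, not code from photo-processor.ts.

```typescript
// Field names taken from the identity comment above; string types are assumed.
interface PassportIdentity {
  family_name?: string;
  given_name?: string;
  passport_number?: string;
  passport_expiry?: string;
}

// Merge only the fields the OCR actually returned, so missing or empty
// values never overwrite attributes that are already stored.
function mergeIdentity(attributesJson: string | null, identity: PassportIdentity): string {
  const current: Record<string, unknown> = attributesJson ? JSON.parse(attributesJson) : {};
  for (const [key, value] of Object.entries(identity)) {
    if (value !== undefined && value !== "") current[key] = value;
  }
  return JSON.stringify(current);
}

// Made-up stored attributes and OCR result.
const merged = mergeIdentity('{"language":"ru","given_name":"Old"}', {
  given_name: "Anna",
  passport_number: undefined,
});
console.log(merged); // → {"language":"ru","given_name":"Anna"}
```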

Install

bun add @chatman-media/kb     # Bun
npm install @chatman-media/kb # npm / pnpm / yarn

Part of the lead-engine monorepo — a multi-tenant SaaS platform for AI sales bots on Telegram / WhatsApp.

License

MIT — Alexander Kireev / chatman-media