
bayesian-bm25 v0.5.0

Bayesian probability transforms for BM25 retrieval scores

Downloads: 655

Bayesian BM25 for JavaScript/TypeScript

A probabilistic framework that converts raw BM25 retrieval scores into calibrated relevance probabilities using Bayesian inference. A TypeScript port of the original bayesian-bm25 package.

Overview

Standard BM25 produces unbounded scores that lack consistent meaning across queries, making threshold-based filtering and multi-signal fusion unreliable. Bayesian BM25 addresses this by applying a sigmoid likelihood model with a composite prior (term frequency + document length normalization) and computing Bayesian posteriors that output well-calibrated probabilities in [0, 1]. A corpus-level base rate prior further improves calibration without requiring relevance labels.
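To make the pipeline concrete, here is a toy sketch of the idea (illustrative constants and a simplified prior of my own, not the package's exact equations): a sigmoid bounds the raw score into a likelihood, and the prior is folded in additively in log-odds space.

```typescript
// Illustrative sketch only: combine a sigmoid likelihood with a prior in
// log-odds space. The constants here are made up for demonstration; see
// BayesianProbabilityTransform for the actual model.
const sigmoid = (x: number): number => 1 / (1 + Math.exp(-x));
const logit = (p: number): number => Math.log(p / (1 - p));

function toyPosterior(score: number, prior: number, alpha = 1.5, beta = 0.0): number {
  const likelihood = sigmoid(alpha * score + beta); // bounded in (0, 1)
  // Bayes in log-odds form: posterior logit = likelihood logit + prior logit
  return sigmoid(logit(likelihood) + logit(prior));
}

console.log(toyPosterior(2.0, 0.5)); // a neutral prior leaves the likelihood unchanged
```

A neutral prior of 0.5 contributes zero log-odds, so the posterior collapses to the likelihood; informative priors shift it up or down.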

Key capabilities:

  • Score-to-probability transform -- convert raw BM25 scores into calibrated relevance probabilities via sigmoid likelihood + composite prior + Bayesian posterior
  • Base rate calibration -- corpus-level base rate prior estimated from score distribution (95th percentile, mixture model, or elbow detection) decomposes the posterior into three additive log-odds terms
  • Parameter learning -- batch gradient descent or online SGD with EMA-smoothed gradients and Polyak averaging, with three training modes: balanced (C1), prior-aware (C2), and prior-free (C3)
  • Probabilistic fusion -- combine multiple probability signals using AND, OR, NOT, and log-odds conjunction with multiplicative confidence scaling, optional per-signal reliability weights (Log-OP), and sparse signal gating (ReLU/Swish activations from Paper 2, Theorems 6.5.3/6.7.4)
  • Learnable fusion weights -- LearnableLogOddsWeights learns per-signal reliability from labeled data via a Hebbian gradient that is backprop-free, starting from Naive Bayes uniform initialization (Remark 5.3.2)
  • Attention-based fusion -- AttentionLogOddsWeights learns query-dependent signal weights via attention mechanism (Paper 2, Section 8), replacing static weights with query-adaptive weighting
  • Hybrid search -- cosineToProbability() converts vector similarity scores to probabilities for fusion with BM25 signals via weighted log-odds conjunction
  • Balanced fusion -- balancedLogOddsFusion() min-max normalizes sparse and dense logits to equalize voting power before combining, preventing heavy-tailed BM25 logits from drowning the dense signal
  • WAND pruning -- wandUpperBound() computes safe Bayesian probability upper bounds for document pruning in top-k retrieval
  • Calibration metrics -- expectedCalibrationError(), brierScore(), reliabilityDiagram(), and calibrationReport() for evaluating probability quality, with CalibrationReport bundling all metrics into a single diagnostic
  • Fusion debugging -- FusionDebugger records every intermediate value through the full pipeline (likelihood, prior, posterior, fusion) for transparent inspection, document comparison, and crossover detection; supports hierarchical fusion tracing with AND/OR/NOT composition
  • Multi-field search -- MultiFieldScorer maintains separate BM25 indexes per field and fuses field-level probabilities via log-odds conjunction with configurable per-field weights
  • Search integration -- built-in BM25 scorer that returns probabilities instead of raw scores, with support for Robertson, Lucene, and ATIRE variants

Installation

npm install bayesian-bm25

Quick Start

Converting BM25 Scores to Probabilities

import { BayesianProbabilityTransform } from "bayesian-bm25";

const transform = new BayesianProbabilityTransform(1.5, 1.0, 0.01);

const scores = [0.5, 1.0, 1.5, 2.0, 3.0];
const tfs = [1, 2, 3, 5, 8];
const docLenRatios = [0.3, 0.5, 0.8, 1.0, 1.5];

const probabilities = transform.scoreToProbability(scores, tfs, docLenRatios);

End-to-End Search with Probabilities

import { BayesianBM25Scorer } from "bayesian-bm25";

const corpusTokens = [
  ["python", "machine", "learning"],
  ["deep", "learning", "neural", "networks"],
  ["data", "visualization", "tools"],
];

const scorer = new BayesianBM25Scorer({
  k1: 1.2,
  b: 0.75,
  method: "lucene",
  baseRate: "auto",
});
scorer.index(corpusTokens);

const { docIds, probabilities } = scorer.retrieve(
  [["machine", "learning"]],
  3,
);

Multi-Field Search

import { MultiFieldScorer } from "bayesian-bm25";

const documents = [
  { title: ["bayesian", "bm25"], body: ["probabilistic", "framework", "search"] },
  { title: ["neural", "networks"], body: ["deep", "learning", "models"] },
  { title: ["information", "retrieval"], body: ["search", "ranking", "relevance"] },
];

const scorer = new MultiFieldScorer({
  fields: ["title", "body"],
  fieldWeights: { title: 0.4, body: 0.6 },
  k1: 1.2,
  b: 0.75,
  method: "lucene",
});
scorer.index(documents);
const { docIds, probabilities } = scorer.retrieve(["bayesian", "search"], 3);

Combining Multiple Signals

import { logOddsConjunction, probAnd, probNot } from "bayesian-bm25";

const signals = [0.85, 0.70, 0.60];

probAnd(signals);              // 0.357 (shrinkage problem)
logOddsConjunction(signals);   // 0.773 (agreement-aware)

// Exclusion query: "python AND NOT java"
const pPython = 0.90;
const pJava = 0.75;
probAnd([pPython, probNot(pJava)]);  // 0.225
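The shrinkage problem noted above is easy to see in isolation: the product rule drives a conjunction toward zero as signals accumulate, even when every individual signal is confident. A standalone sketch:

```typescript
// Product-rule AND shrinks toward 0 as more signals are multiplied in,
// even when each individual probability is high.
const naiveAnd = (probs: number[]): number => probs.reduce((acc, p) => acc * p, 1);

console.log(naiveAnd([0.8, 0.8]));          // 0.64
console.log(naiveAnd([0.8, 0.8, 0.8]));     // ~0.512
console.log(naiveAnd(Array(10).fill(0.8))); // ~0.107
```

This is why logOddsConjunction exists: it rewards agreement among signals instead of compounding their uncertainty multiplicatively.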

Hybrid Text + Vector Search

import { cosineToProbability, logOddsConjunction } from "bayesian-bm25";

// BM25 probabilities (from Bayesian BM25)
const bm25Probs = [0.85, 0.60, 0.40];

// Vector search cosine similarities -> probabilities
const cosineScores = [0.92, 0.35, 0.70];
const vectorProbs = cosineToProbability(cosineScores);  // [0.96, 0.675, 0.85]

// Fuse with reliability weights (BM25 weight=0.6, vector weight=0.4)
const stacked = bm25Probs.map((bp, i) => [bp, vectorProbs[i]!]);
const fused = logOddsConjunction(stacked, undefined, [0.6, 0.4]);

// Fuse with weights and confidence scaling (alpha + weights compose)
const fusedScaled = logOddsConjunction(stacked, 0.5, [0.6, 0.4]);

// Gated fusion: ReLU/Swish activation in logit space (Paper 2, Theorems 6.5.3/6.7.4)
const fusedRelu = logOddsConjunction(stacked, undefined, undefined, "relu");   // MAP estimation
const fusedSwish = logOddsConjunction(stacked, undefined, undefined, "swish"); // Bayes estimation

Balanced Sparse-Dense Fusion

import { balancedLogOddsFusion } from "bayesian-bm25";

// BM25 probabilities (from Bayesian BM25)
const bm25Probs = [0.85, 0.60, 0.40];

// Dense cosine similarities (from vector search)
const cosineScores = [0.92, 0.35, 0.70];

// Balanced fusion: normalizes logits before combining
const fused = balancedLogOddsFusion(bm25Probs, cosineScores);

// Asymmetric weighting (0.7 = sparse weight, 0.3 implicit dense weight)
const fusedWeighted = balancedLogOddsFusion(bm25Probs, cosineScores, 0.7);
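The underlying idea is to put both signals on a comparable logit scale before mixing them. A simplified sketch of that normalization (my own approximation, not the package's exact implementation):

```typescript
const sigmoid = (x: number): number => 1 / (1 + Math.exp(-x));
const logit = (p: number): number => Math.log(p / (1 - p));

// Min-max normalize an array to [0, 1] so one signal's heavy-tailed
// logits cannot dominate the other.
function minMax(xs: number[]): number[] {
  const lo = Math.min(...xs);
  const hi = Math.max(...xs);
  return xs.map((x) => (hi === lo ? 0.5 : (x - lo) / (hi - lo)));
}

function toyBalancedFusion(sparseProbs: number[], cosines: number[], sparseWeight = 0.5): number[] {
  const sparse = minMax(sparseProbs.map(logit));
  const dense = minMax(cosines.map((c) => logit((1 + c) / 2))); // cosine -> prob -> logit
  return sparse.map((s, i) => sigmoid(sparseWeight * s + (1 - sparseWeight) * dense[i]!));
}

const fusedToy = toyBalancedFusion([0.85, 0.6, 0.4], [0.92, 0.35, 0.7]);
```

With the example arrays above, doc 0 ranks first because both signals agree on it; after normalization neither signal can win on magnitude alone.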

Learnable Fusion Weights

import { LearnableLogOddsWeights, logOddsConjunction } from "bayesian-bm25";

// 3-signal hybrid system: BM25, vector, metadata
const learner = new LearnableLogOddsWeights(3);

// Batch training on labeled data
const signalsBatch = [
  [0.9, 0.8, 0.3],  // doc 1: BM25 and vector agree
  [0.2, 0.7, 0.6],  // doc 2: vector is more reliable
  [0.8, 0.3, 0.9],  // doc 3: BM25 and metadata agree
];
const labels = [1, 1, 1];
learner.fit(signalsBatch, labels);

console.log(learner.weights);          // learned reliability weights
console.log(learner.averagedWeights);  // Polyak-averaged (smoother)

// Combine signals using learned weights
const fused = learner.combine([0.85, 0.70, 0.50]);

// Online update from live feedback
learner.update([0.75, 0.60, 0.40], 1.0, { learningRate: 0.01 });

Attention-Based Fusion

import { AttentionLogOddsWeights } from "bayesian-bm25";

// 2 retrieval signals, 3 query features
const attn = new AttentionLogOddsWeights(2, 3, 0.5);

// Train on labeled data with query features
// trainingProbs: number[][] (m x 2), trainingLabels: number[] (m), queryFeatures: number[][] (m x 3)
attn.fit(trainingProbs, trainingLabels, queryFeatures, {
  learningRate: 0.01,
  maxIterations: 500,
});

// Query-dependent fusion: weights adapt per query
const fused = attn.combine(testProbs, testFeatures, true);

Debugging Fusion Decisions

import { FusionDebugger, BayesianProbabilityTransform } from "bayesian-bm25";

const transform = new BayesianProbabilityTransform(1.5, 1.0, 0.01);
const debugger_ = new FusionDebugger(transform); // "debugger" is a reserved word

// Trace a BM25 score through the full probability pipeline
const bm25Trace = debugger_.traceBM25(2.5, 3, 0.8);
console.log(debugger_.formatTrace(bm25Trace));
// => "BM25  score=2.500  L=0.818  prior=0.650  post=0.790"

// Full document trace (BM25 + vector + fusion)
const docTrace = debugger_.traceDocument(2.5, 3, 0.8, 0.85);
console.log(debugger_.formatSummary(docTrace));

// Compare two documents to explain rank differences
const docA = debugger_.traceDocument(3.0, 5, 0.9, 0.80);
const docB = debugger_.traceDocument(1.5, 2, 1.2, 0.95);
const comparison = debugger_.compare(docA, docB);
console.log(debugger_.formatComparison(comparison));

WAND Pruning with Bayesian Upper Bounds

import { BayesianProbabilityTransform } from "bayesian-bm25";

const transform = new BayesianProbabilityTransform(1.5, 2.0, 0.01);

// Standard BM25 upper bound per query term
const bm25UpperBound = 5.0;

// Bayesian upper bound for safe pruning -- any document's actual
// probability is guaranteed to be at most this value
const bayesianBound = transform.wandUpperBound(bm25UpperBound);
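The bound is safe because the sigmoid likelihood is monotone in the score: if no document can exceed the BM25 upper bound, no document's likelihood can exceed the likelihood at that bound. A minimal sketch of the monotonicity argument (illustrative constants, not the package's internals):

```typescript
const sigmoid = (x: number): number => 1 / (1 + Math.exp(-x));

// Because sigmoid is increasing, an upper bound on the score yields an
// upper bound on the likelihood, which is safe for pruning in top-k search.
const alpha = 1.5;
const scoreUpperBound = 5.0;
const likelihoodBound = sigmoid(alpha * scoreUpperBound);

const actualScores = [1.2, 3.4, 4.9]; // all at or below the score bound
const safe = actualScores.every((s) => sigmoid(alpha * s) <= likelihoodBound);
console.log(safe); // true
```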

Evaluating Calibration Quality

import {
  expectedCalibrationError,
  brierScore,
  reliabilityDiagram,
  calibrationReport,
} from "bayesian-bm25";

const probabilities = [0.9, 0.8, 0.3, 0.1, 0.7, 0.2];
const labels = [1.0, 1.0, 0.0, 0.0, 1.0, 0.0];

const ece = expectedCalibrationError(probabilities, labels); // lower is better
const bs = brierScore(probabilities, labels);                // lower is better
const bins = reliabilityDiagram(probabilities, labels, 5);   // [avgPred, avgActual, count]

// One-call diagnostic report
const report = calibrationReport(probabilities, labels);
console.log(report.summary());  // formatted text with ECE, Brier, and reliability table
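For intuition, both metrics are simple to compute by hand. The sketch below implements the standard definitions of the Brier score and a basic binned ECE; the package's exact binning may differ.

```typescript
// Brier score: mean squared error between predicted probability and 0/1 label.
function toyBrier(probs: number[], labels: number[]): number {
  return probs.reduce((acc, p, i) => acc + (p - labels[i]!) ** 2, 0) / probs.length;
}

// ECE: bin predictions by confidence, then average |mean prediction - mean label|
// per bin, weighted by bin size.
function toyECE(probs: number[], labels: number[], nBins = 5): number {
  let ece = 0;
  for (let b = 0; b < nBins; b++) {
    const lo = b / nBins;
    const hi = (b + 1) / nBins;
    const idx = probs
      .map((_, i) => i)
      .filter((i) => probs[i]! >= lo && (probs[i]! < hi || b === nBins - 1));
    if (idx.length === 0) continue;
    const avgPred = idx.reduce((a, i) => a + probs[i]!, 0) / idx.length;
    const avgTrue = idx.reduce((a, i) => a + labels[i]!, 0) / idx.length;
    ece += (idx.length / probs.length) * Math.abs(avgPred - avgTrue);
  }
  return ece;
}

console.log(toyBrier([0.9, 0.8, 0.3, 0.1, 0.7, 0.2], [1, 1, 0, 0, 1, 0])); // ~0.0467
```

A perfectly calibrated system scores 0 on both; on this toy data the Brier score is 0.28/6 ≈ 0.047.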

Online Learning from User Feedback

import { BayesianProbabilityTransform } from "bayesian-bm25";

const transform = new BayesianProbabilityTransform(1.0, 0.0);

// Batch warmup on historical data
// (historicalScores: number[], historicalLabels: number[] of 0/1 relevance)
transform.fit(historicalScores, historicalLabels);

// Online refinement from live feedback
for (const { score, label } of feedbackStream) {
  transform.update(score, label, { learningRate: 0.01, momentum: 0.95 });
}

// Use Polyak-averaged parameters for stable inference
const alpha = transform.averagedAlpha;
const beta = transform.averagedBeta;
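Polyak averaging maintains a running average of the parameter iterates, smoothing out SGD noise for inference. A standalone sketch of the mechanism (the package's exact decay schedule may differ):

```typescript
// Exponential moving average of parameter iterates (Polyak-style averaging):
// noisy per-step values are smoothed into a stable estimate for inference.
function emaAverage(iterates: number[], decay = 0.9): number {
  let avg = iterates[0]!;
  for (const theta of iterates.slice(1)) {
    avg = decay * avg + (1 - decay) * theta;
  }
  return avg;
}

const noisyAlphas = [1.0, 1.4, 0.8, 1.2, 1.1]; // hypothetical SGD iterates
console.log(emaAverage(noisyAlphas)); // smoother than the last raw iterate
```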

Training Modes

import { BayesianProbabilityTransform } from "bayesian-bm25";

const transform = new BayesianProbabilityTransform(1.0, 0.0);

// C1 (balanced, default): train on sigmoid likelihood
transform.fit(scores, labels, { mode: "balanced" });

// C2 (prior-aware): train on full Bayesian posterior
transform.fit(scores, labels, { mode: "prior_aware", tfs, docLenRatios });

// C3 (prior-free): train on likelihood, inference uses prior=0.5
transform.fit(scores, labels, { mode: "prior_free" });

API

BayesianProbabilityTransform

Core class for converting BM25 scores to calibrated probabilities.

new BayesianProbabilityTransform(alpha?, beta?, baseRate?)

| Method | Description |
|---|---|
| likelihood(score) | Sigmoid likelihood (Eq. 20) |
| scoreToProbability(score, tf, docLenRatio) | Full pipeline: BM25 score to calibrated probability |
| wandUpperBound(bm25UpperBound) | Bayesian WAND upper bound for safe pruning (Theorem 6.1.2) |
| fit(scores, labels, options?) | Batch gradient descent with training mode support |
| update(score, label, options?) | Online SGD with EMA gradients and Polyak averaging |

Static methods: tfPrior(tf), normPrior(docLenRatio), compositePrior(tf, docLenRatio), posterior(likelihood, prior, baseRate?)

All methods accept both scalar (number) and array (number[]) inputs.

FitOptions: learningRate, maxIterations, tolerance, mode ("balanced" | "prior_aware" | "prior_free"), tfs, docLenRatios

UpdateOptions: learningRate, momentum, decayTau, maxGradNorm, avgDecay, mode, tf, docLenRatio

BayesianBM25Scorer

Integrated BM25 search with Bayesian probability output.

new BayesianBM25Scorer({ k1?, b?, method?, alpha?, beta?, baseRate?, baseRateMethod? })

| Method | Description |
|---|---|
| index(corpusTokens) | Build BM25 index and auto-estimate parameters |
| retrieve(queryTokens, k?, explain?) | Top-k retrieval with calibrated probabilities; explain=true returns RetrievalResult with per-document BM25SignalTrace explanations |
| getProbabilities(queryTokens) | Dense probability array for all documents |
| addDocuments(newCorpusTokens) | Append documents and rebuild index (IDF recomputation) |

Properties: numDocs, docLengths, avgdl, baseRate

The baseRate option accepts null (default, no correction), "auto" (estimated from corpus), or a number in (0, 1).

The baseRateMethod option controls how "auto" base rate is estimated: "percentile" (default, 95th percentile heuristic), "mixture" (2-component Gaussian EM), or "elbow" (knee point detection in sorted score curve).

Fusion Functions

| Function | Description |
|---|---|
| cosineToProbability(score) | Convert cosine similarity [-1, 1] to probability (Definition 7.1.2) |
| probNot(prob) | Probabilistic NOT via complement rule: 1 - P(R) (Eq. 35) |
| probAnd(probs) | Probabilistic AND via product rule (Eq. 33-34) |
| probOr(probs) | Probabilistic OR via complement rule (Eq. 36-37) |
| logOddsConjunction(probs, alpha?, weights?, gating?) | Log-odds conjunction with optional per-signal weights (Theorem 8.3) and ReLU/Swish gating (Theorems 6.5.3/6.7.4) |
| balancedLogOddsFusion(sparse, dense, weight?) | Min-max normalized logit fusion for hybrid sparse-dense retrieval |

probNot accepts scalar (number) or array (number[]) inputs. Other fusion functions accept 1D (number[]) or batched 2D (number[][]) inputs. alpha accepts number, "auto" (resolves to 0.5, implementing the sqrt(n) scaling law from Paper 2, Theorem 4.2.1), or undefined. gating accepts "none" (default), "relu", or "swish".
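The per-signal weights implement a logarithmic opinion pool: each signal's logit is scaled by its reliability weight before the sum is mapped back through the sigmoid. A minimal sketch of that weighting (illustrative, without the package's alpha scaling or gating):

```typescript
const sigmoid = (x: number): number => 1 / (1 + Math.exp(-x));
const logit = (p: number): number => Math.log(p / (1 - p));

// Logarithmic opinion pool: weighted sum of logits, mapped back to [0, 1].
// Weights summing to 1 keep a unanimous vote at its original probability.
function toyLogOddsPool(probs: number[], weights: number[]): number {
  const z = probs.reduce((acc, p, i) => acc + weights[i]! * logit(p), 0);
  return sigmoid(z);
}

console.log(toyLogOddsPool([0.8, 0.8], [0.5, 0.5])); // 0.8 (unanimous signals preserved)
```

Shifting weight toward a signal moves the fused probability toward that signal's vote, which is the behavior the learnable and attention-based weight classes below exploit.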

LearnableLogOddsWeights

Learns per-signal reliability weights via Hebbian gradient descent (Remark 5.3.2).

new LearnableLogOddsWeights(nSignals, alpha?)

| Method | Description |
|---|---|
| combine(probs) | Fuse signals using current weights via logOddsConjunction() |
| fit(signalsBatch, labels, options?) | Batch gradient descent on BCE loss |
| update(signals, label, options?) | Online SGD with EMA gradients and Polyak averaging |

Properties: weights, averagedWeights, nSignals, alpha

FitOptions: learningRate, maxIterations, tolerance, useAveraged

UpdateOptions: learningRate, momentum, decayTau, maxGradNorm, avgDecay

AttentionLogOddsWeights

Learns query-dependent signal weights via linear attention projection (Paper 2, Section 8).

new AttentionLogOddsWeights(nSignals, nQueryFeatures, alpha?)

| Method | Description |
|---|---|
| combine(probs, queryFeatures, useAveraged?) | Fuse signals with query-dependent weights |
| fit(signalsBatch, labels, queryFeatures, options?) | Batch gradient descent on BCE loss |
| update(signals, label, queryFeatures, options?) | Online SGD with EMA gradients and Polyak averaging |

Properties: weightsMatrix, nSignals, nQueryFeatures, alpha

FitOptions: learningRate, maxIterations, tolerance, useAveraged

UpdateOptions: learningRate, momentum, decayTau, maxGradNorm, avgDecay

MultiFieldScorer

Multi-field BM25 scorer with per-field Bayesian probability fusion.

new MultiFieldScorer({ fields, fieldWeights?, alpha?, baseRate?, k1?, b?, method? })

| Method | Description |
|---|---|
| index(documents) | Build per-field BM25 indexes from Record<string, string[]>[] |
| getProbabilities(queryTokens) | Dense fused probabilities across all documents |
| retrieve(queryTokens, k?) | Top-k documents by fused probability |
| addDocuments(newDocuments) | Incremental document addition |

Properties: numDocs, fields, fieldWeights

FusionDebugger

Traces intermediate values through the probability pipeline for debugging and explanation.

new FusionDebugger(transform)

| Method | Description |
|---|---|
| traceBM25(score, tf, docLenRatio) | Trace BM25 score through likelihood, prior, posterior |
| traceVector(cosineSimilarity) | Trace cosine similarity through probability conversion |
| traceNot(signalTrace) | Trace probabilistic negation of a signal |
| traceFusion(signals, method?, weights?, alpha?) | Trace signal combination with method-specific intermediates |
| traceDocument(score, tf, dlr, cos, method?, weights?, alpha?) | Full pipeline trace: BM25 + vector + fusion |
| compare(traceA, traceB) | Compare two document traces to explain rank differences |
| formatTrace(trace) | Human-readable trace output |
| formatSummary(docTrace) | One-line pipeline summary |
| formatComparison(comparison) | Side-by-side document comparison |

Fusion methods: "log_odds" (default), "prob_and", "prob_or", "prob_not"

Calibration Metrics

| Function | Description |
|---|---|
| expectedCalibrationError(probabilities, labels, nBins?) | Expected Calibration Error -- measures predicted vs actual relevance rates |
| brierScore(probabilities, labels) | Brier score -- mean squared error between probabilities and labels |
| reliabilityDiagram(probabilities, labels, nBins?) | Reliability diagram data: [avgPredicted, avgActual, count] per bin |
| calibrationReport(probabilities, labels, nBins?) | One-call diagnostic: returns CalibrationReport with ECE, Brier, reliability, and summary() |

BM25

Low-level BM25 search engine supporting Robertson, Lucene, and ATIRE IDF variants.

| Method | Description |
|---|---|
| index(corpusTokens) | Build inverted index |
| getScores(queryTokens) | Score all documents for a query |
| retrieve(queryTokensBatch, k) | Top-k retrieval for multiple queries |

Citation

If you use this work, please cite the following papers:

@preprint{Jeong2026BayesianBM25,
  author    = {Jeong, Jaepil},
  title     = {Bayesian {BM25}: {A} Probabilistic Framework for Hybrid Text
               and Vector Search},
  year      = {2026},
  publisher = {Zenodo},
  doi       = {10.5281/zenodo.18414940},
  url       = {https://doi.org/10.5281/zenodo.18414940}
}

@preprint{Jeong2026BayesianNeural,
  author    = {Jeong, Jaepil},
  title     = {From {Bayesian} Inference to Neural Computation: The Analytical
               Emergence of Neural Network Structure from Probabilistic
               Relevance Estimation},
  year      = {2026},
  publisher = {Zenodo},
  doi       = {10.5281/zenodo.18512411},
  url       = {https://doi.org/10.5281/zenodo.18512411}
}

License

This project is licensed under the Apache License 2.0.

Copyright (c) 2023-2026 Cognica, Inc.