npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2026 – Pkg Stats / Ryan Hefner

@hickeroar/tsbayes

v1.3.0

Published

A trainable Bayesian text classifier library and HTTP service.

Readme

tsbayes

A memory-based, optional-persistence naive Bayesian text classification package and web API for TypeScript/Node.js.

CI npm version License: MIT


Why?

Bayesian text classification is useful for things like spam detection,
sentiment determination, and general category routing.

You gather representative samples for each category, train the model,
then classify new text based on learned token patterns.

Once the model is trained, you can:
- classify input into a best-fit category
- inspect relative per-category scores
- persist and reload model state

Installation

Requires Node.js 20 or newer.

Package usage:

npm install @hickeroar/tsbayes

Contributor/development setup:

git clone <your-repo-url>
cd tsbayes
npm ci

Run as an API Server

See Running the Server for all run methods. CLI options override environment variables.

npm run dev

Running the Server

Global install

npm install -g @hickeroar/tsbayes
tsbayes
tsbayes --port 3000 --verbose

Local install (non-global)

npm install @hickeroar/tsbayes
npx tsbayes
# or
./node_modules/.bin/tsbayes

From source (npm run start)

npm run build
npm run start
npm run start -- --port 3000

Environment variables

TSBAYES_PORT=9000 TSBAYES_VERBOSE=true tsbayes

Help

tsbayes --help

CLI options

| Option | Short | Description | | ---------------------- | ----- | ----------------------------------------------- | | -h, --help | | Show help | | -v, --verbose | | Log requests, responses, and classifier details | | --host <host> | | Listen host (default: 0.0.0.0) | | --port <port> | | Listen port (default: 8000) | | --auth-token <token> | | Bearer token for API auth | | --language <lang> | | Stemmer language (default: english) | | --remove-stop-words | | Filter stop words |

Environment variables (overridden by CLI)

TSBAYES_HOST             default: "0.0.0.0"
TSBAYES_PORT             default: "8000"
TSBAYES_AUTH_TOKEN
TSBAYES_LANGUAGE         default: "english" — stemmer language for the server classifier
TSBAYES_REMOVE_STOP_WORDS   default: "false" — set to "true", "1", or "yes" to filter stop words
TSBAYES_VERBOSE          default: "false" — set to "true", "1", or "yes" for request/response logging

When TSBAYES_AUTH_TOKEN is configured, all API endpoints except /healthz and /readyz (and their query-string variants, e.g. /healthz?x=1) require:

Authorization: Bearer <token>

Use as a Library in Your App

Import and create a classifier:

import { TextClassifier, loadFromFile, saveToFile } from "@hickeroar/tsbayes";

const classifier = new TextClassifier();

classifier.train("spam", "buy now limited offer click here");
classifier.train("ham", "team meeting schedule for tomorrow");

const classification = classifier.classificationResult("limited offer today");
console.log(`category=${classification.category} score=${classification.score}`);

const scores = classifier.score("team schedule update");
console.log(scores);

classifier.untrain("spam", "buy now limited offer click here");

await saveToFile(classifier, "/tmp/tsbayes-model.json");
const loaded = new TextClassifier();
await loadFromFile(loaded, "/tmp/tsbayes-model.json");
console.log(loaded.classificationResult("limited offer today"));

Notes for library usage:

  • Classifier operations are safe for concurrent request handling in a single Node process.
  • Scores are relative values; compare scores within the same model.
  • Default tokenization applies Unicode NFKC normalization, lowercasing, non-word splitting, and English stemming.
  • Category names accepted by train and untrain match ^[-_A-Za-z0-9]{1,64}$.

Multi-language and stop words

Supported languages match snowball-stemmers exactly (e.g. english, spanish, french, german, tamil). Use supportedLanguages() to list them.

Create a custom tokenizer with createTokenizer({ language?, removeStopWords? }):

const tokenizer = createTokenizer({ language: "spanish", removeStopWords: true });
const classifier = new TextClassifier({ tokenizer });
// Or pass options directly:
const classifier = new TextClassifier({ language: "spanish", removeStopWords: true });

When using language or removeStopWords options (not a custom tokenizer), persisted models store the tokenizer config; loading restores it.

Errors

ValidationError and PersistenceError are thrown by:

  • train / untrain — invalid category names
  • saveToFile / loadFromFile — invalid paths or corrupted model state

Import and use for typed error handling:

import { TextClassifier, ValidationError, PersistenceError } from "@hickeroar/tsbayes";

try {
  await loadFromFile(classifier, path);
} catch (e) {
  if (e instanceof PersistenceError) {
    console.error("Model load failed:", e.message);
  }
}

File API notes:

  • saveToFile and loadFromFile default to /tmp/tsbayes-model.json when no path is provided.
  • Provided file paths must be absolute.

Development Checks

npm run format:check
npm run lint
npm run typecheck
npm run test:coverage
npm run build

Security checks used by CI:

npm audit --omit=dev --audit-level=high
npm audit --audit-level=high

Using the HTTP API

API Notes

  • Category names in /train/:category and /untrain/:category must match ^[-_A-Za-z0-9]{1,64}$.
  • Invalid category path params return 400 with {"error":"invalid request"}.
  • Request body size is capped at 1 MiB.
  • Error responses are JSON: {"error":"<message>"}.
  • If charset is declared in Content-Type, it must be UTF-8.
  • The HTTP service stores classifier state in memory; process restarts clear training data.
  • Empty request bodies are accepted for /train/:category and /untrain/:category; this is a no-op for model token tallies.

Common Error Responses

| Status | When | | ------ | ----------------------------------------------------------------------------------------------------------- | | 400 | Invalid payload or route params (for example non-UTF-8 charset, non-text body, or invalid category pattern) | | 401 | Missing/invalid bearer token when auth is enabled | | 404 | Invalid route | | 413 | Request body exceeds 1 MiB | | 500 | Unexpected server error |

Training the Classifier

Endpoint:
/train/:category
Example: /train/spam
Accepts: POST
Body: raw text/plain

Example:

curl -s -X POST "http://localhost:8000/train/spam" \
  -H "Content-Type: text/plain" \
  --data "buy now limited offer click here"

Untraining the Classifier

Endpoint:
/untrain/:category
Example: /untrain/spam
Accepts: POST
Body: raw text/plain

Getting Classifier Status

Endpoint:
/info
Accepts: GET

Example response:

{
  "categories": [
    {
      "category": "spam",
      "tokenTally": 6
    }
  ]
}

Classifying Text

Endpoint:
/classify
Accepts: POST
Body: raw text/plain

Example response:

{
  "category": "spam",
  "score": 3.2142857142857144
}

If no category can be selected (for example, untrained model), category is returned as null.

Scoring Text

Endpoint:
/score
Accepts: POST
Body: raw text/plain

Example response:

{
  "spam": 3.2142857142857144,
  "ham": 0.7857142857142857
}

Flushing Training Data

Endpoint:
/flush
Accepts: POST
Body: raw text/plain (optional)

Health and Readiness

Liveness endpoint
/healthz
Accepts: GET
Readiness endpoint
/readyz
Accepts: GET

/healthz and /readyz (including query-string variants like /healthz?x=1) are intentionally unauthenticated even when API auth is enabled.

Operational Notes

  • The HTTP server is in-memory by default; deploys/restarts wipe trained state.
  • Use saveToFile and loadFromFile in library workflows to persist/reload model state.
  • /readyz returns 200 while accepting traffic and 503 when draining during shutdown.