@raeven-co/sether-ner

v0.1.1

Published

10 days ago

Free-text NER redaction (names, organisations, locations) for Sether. Lazy-loaded ONNX model via transformers.js. Tokens restore through @raeven-co/sether's vault.

Downloads

270

Vulnerabilities: 0 High, 0 Medium, 0 Low

godfreylebo

pii ner redaction sether llm privacy onnx transformers gdpr

@raeven-co/sether-ner

Free-text NER redaction — names, organisations, locations — for Sether. The part regex can't do, shipped as a separate, lazy-loaded package so the core stays ~35 KB.

The core Sether detectors catch structured PII (emails, cards, SSNs, keys) and label-anchored identity (Name:, DOB:). This package adds the hard part: unlabelled people, companies, and places in running prose, which neither a customer's own weekend build nor the regex-only competitors can replicate.

Why a separate package

The model plus the ONNX runtime comes to 30 MB or more. Keeping it out of the core means new Sether() stays tiny and dependency-light.
NER is async and runs on the full outbound text (the prompt you're about to send), not on the streaming response, so it is a different integration point from the sync, chunk-boundary-safe core detectors. This package is upfront about that.
@huggingface/transformers is an optional peer dependency — install it only if you use the default model. Bring your own inferer (e.g. GLiNER) and you don't need it at all.

Install

npm install @raeven-co/sether-ner @huggingface/transformers

Use

import { Sether, redactSync, basicDetectors } from '@raeven-co/sether';
import { createNerRedactor } from '@raeven-co/sether-ner';

const sether = new Sether();
const ner = createNerRedactor();          // Xenova/bert-base-NER, lazy-loaded on first call

// Outbound path: NER first (names/orgs/locations), then structured PII — one vault.
const { redacted } = await ner.redact(userPrompt, { vault: sether.vault });
const safe = redactSync(redacted, { detectors: basicDetectors, vault: sether.vault });

// Send `safe` to the LLM. On the reply, sether.restore() swaps BOTH token sets
// back, because NER tokens use the same `<TYPE_uuid>` format the core restores.

NER tokens look like <NAME_…>, <ORG_…>, <LOCATION_…> and restore through the core's restore() / createRestoreStream() with no extra wiring.
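To make the shared-vault idea concrete, here is a minimal sketch of how span replacement and restoration could work with one vault. It is not Sether's implementation: the Map-based vault, `redactSpans`, and `restoreTokens` are hypothetical names, and the real token id format may differ from what `randomUUID()` produces.

```typescript
import { randomUUID } from 'node:crypto';

type Span = { type: 'NAME' | 'ORG' | 'LOCATION'; start: number; end: number };

// Hypothetical vault: token -> original text. The real vault lives in @raeven-co/sether.
type Vault = Map<string, string>;

// Replace each (non-overlapping) span with a <TYPE_uuid> token and remember the original.
function redactSpans(text: string, spans: Span[], vault: Vault): string {
  // Work from the end of the string so earlier offsets stay valid.
  const ordered = [...spans].sort((a, b) => b.start - a.start);
  let out = text;
  for (const { type, start, end } of ordered) {
    const token = `<${type}_${randomUUID()}>`;
    vault.set(token, text.slice(start, end));
    out = out.slice(0, start) + token + out.slice(end);
  }
  return out;
}

// Swap every known token back; unknown tokens are left untouched.
function restoreTokens(text: string, vault: Vault): string {
  return text.replace(/<[A-Z]+_[0-9a-f-]{36}>/g, (token) => vault.get(token) ?? token);
}
```

Because both NER and structured-PII tokens would sit in the same map, a single restore pass over the reply covers both sets, which is the property the snippet above relies on.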

Options

createNerRedactor({
  model: 'Xenova/bert-base-NER',  // any transformers.js token-classification model
  threshold: 0.6,                 // min confidence
  labels: ['NAME', 'ORG'],        // restrict which types to redact
  infer: myGlinerInferer,         // bring your own model / service / mock
});
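How threshold and labels interact can be pictured as a filter over the raw model output. The sketch below is a guess at the behaviour, not the package's code: it assumes entities scoring below the threshold are dropped, and that the model's groups are mapped to token types before the labels check. The `PER`/`ORG`/`LOC` to `NAME`/`ORG`/`LOCATION` mapping and the `selectEntities` helper are illustrative.

```typescript
type RawEntity = { entity_group: string; score: number; start: number; end: number };
type TokenType = 'NAME' | 'ORG' | 'LOCATION';

// Assumed mapping from bert-base-NER entity groups to token types.
const GROUP_TO_TYPE: Record<string, TokenType | undefined> = {
  PER: 'NAME',
  ORG: 'ORG',
  LOC: 'LOCATION',
};

function selectEntities(
  entities: RawEntity[],
  threshold: number,
  labels: TokenType[],
): Array<RawEntity & { type: TokenType }> {
  const kept: Array<RawEntity & { type: TokenType }> = [];
  for (const entity of entities) {
    const type = GROUP_TO_TYPE[entity.entity_group];
    if (type === undefined) continue;        // unmapped group, e.g. MISC
    if (entity.score < threshold) continue;  // below min confidence
    if (!labels.includes(type)) continue;    // type not requested
    kept.push({ ...entity, type });
  }
  return kept;
}
```

With `threshold: 0.6` and `labels: ['NAME', 'ORG']`, a location would pass through unredacted even at high confidence, so restrict labels only when that is what you want.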

Bring your own model (e.g. GLiNER)

infer is (text) => Promise<RawEntity[]> where each entity has { entity_group, score, start, end } (absolute char offsets). Map your model's output to that shape and the rest of the pipeline is identical — which also makes the whole redactor unit-testable without downloading a model.
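Because infer is just an async function, a dictionary-backed stand-in is enough for unit tests. The sketch below relies only on the `{ entity_group, score, start, end }` shape described above. `findKnownEntities`, `mockInfer`, the dictionary contents, and the `PER`/`ORG` group names are all hypothetical, so use whatever labels your pipeline expects.

```typescript
type RawEntity = { entity_group: string; score: number; start: number; end: number };

// Hypothetical fixture: strings the mock should "recognise".
const KNOWN: Array<{ text: string; entity_group: string }> = [
  { text: 'Ada Lovelace', entity_group: 'PER' },
  { text: 'Acme', entity_group: 'ORG' },
];

// Find every occurrence of each known string and report absolute char offsets.
function findKnownEntities(text: string): RawEntity[] {
  const found: RawEntity[] = [];
  for (const { text: needle, entity_group } of KNOWN) {
    let at = text.indexOf(needle);
    while (at !== -1) {
      found.push({ entity_group, score: 1, start: at, end: at + needle.length });
      at = text.indexOf(needle, at + needle.length);
    }
  }
  return found.sort((a, b) => a.start - b.start);
}

// Matches the (text) => Promise<RawEntity[]> signature of the infer option.
const mockInfer = async (text: string): Promise<RawEntity[]> => findKnownEntities(text);
```

Passing this as `createNerRedactor({ infer: mockInfer })` should exercise the rest of the pipeline without a model download; a GLiNER adapter would have the same signature, with the lookup replaced by a call to your model or service.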

Honest limitations

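First call is slow: the model downloads (30 MB or more) and warms up. Call ner.warmup() at boot. Inference through ONNX in JavaScript can also be slower than a native PyTorch setup; how much depends on the runtime and hardware, so measure it in your own environment.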
Not on the streaming hot path. NER is a batched forward pass; it runs on the outbound prompt, not per-chunk on the response. Restoration of the response is the core's job (sync, chunk-boundary-safe).
Accuracy is model-bound. bert-base-NER is solid for Western names/orgs; multilingual text and domain-specific names need a better model (swap via model/infer). We'd rather you know that than oversell it.
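The lazy-load-then-warm-up pattern behind the first limitation can be sketched generically. This illustrates the idea only and is not the package's internals: `lazy` and the loader are hypothetical, and a real loader would fetch and initialise the ONNX model instead of returning a string.

```typescript
// Wrap an expensive async loader so it runs at most once.
function lazy<T>(loader: () => Promise<T>): { get: () => Promise<T> } {
  let pending: Promise<T> | undefined;
  return {
    get: () => {
      // First call starts the load; later calls reuse the same promise.
      if (pending === undefined) pending = loader();
      return pending;
    },
  };
}

let loads = 0;
const model = lazy(async () => {
  loads += 1; // stands in for the expensive download and warm-up
  return 'model-ready';
});

// Calling get() at boot plays the role of ner.warmup(): the cost is paid
// before the first user request instead of during it.
const atBoot = model.get();
const onRequest = model.get();
```

Without the boot-time call, the first request would absorb the full download and warm-up latency.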
