@raeven-co/sether-ner
v0.1.1
Free-text NER redaction (names, organisations, locations) for Sether. Lazy-loaded ONNX model via transformers.js. Tokens restore through @raeven-co/sether's vault.
@raeven-co/sether-ner
Free-text NER redaction — names, organisations, locations — for Sether. The part regex can't do, shipped as a separate, lazy-loaded package so the core stays ~35 KB.
The core Sether detectors catch structured PII (emails, cards, SSNs, keys) and
label-anchored identity (Name:, DOB:). This package adds the hard part:
unlabelled people, companies, and places in running prose — the thing a
customer's own weekend build and the regex-only competitors can't replicate.
Why a separate package
- The model + ONNX runtime is ~30 MB+. Keeping it out of the core means new Sether() stays tiny and dependency-light.
- NER is async and runs on full outbound text (the prompt you're about to send), not the streaming response, so it's a different integration point than the sync, chunk-boundary-safe core detectors. This package is honest about that.
- @huggingface/transformers is an optional peer dependency: install it only if you use the default model. Bring your own inferer (e.g. GLiNER) and you don't need it at all.
Install
npm install @raeven-co/sether-ner @huggingface/transformers
Use
import { Sether, redactSync, basicDetectors } from '@raeven-co/sether';
import { createNerRedactor } from '@raeven-co/sether-ner';
const sether = new Sether();
const ner = createNerRedactor(); // Xenova/bert-base-NER, lazy-loaded on first call
// Outbound path: NER first (names/orgs/locations), then structured PII — one vault.
const { redacted } = await ner.redact(userPrompt, { vault: sether.vault });
const safe = redactSync(redacted, { detectors: basicDetectors, vault: sether.vault });
// Send `safe` to the LLM. On the reply, sether.restore() swaps BOTH token sets
// back, because NER tokens use the same `<TYPE_uuid>` format the core restores.
NER tokens look like <NAME_…>, <ORG_…>, <LOCATION_…> and restore through the
core's restore() / createRestoreStream() with no extra wiring.
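To make the shared-token idea concrete, here is a minimal, self-contained sketch of a token round trip. It is not Sether's implementation: `DemoVault`, `Span`, `tokenize`, and `restore` are hypothetical stand-ins, the token pattern is only assumed from the `<TYPE_uuid>` shape described above, and a counter replaces the real UUID so the example stays deterministic.

```typescript
// Hypothetical stand-in for the vault: maps each token to the text it replaced.
class DemoVault {
  private entries = new Map<string, string>();
  private counter = 0;

  // Mint a token in the <TYPE_id> shape; a real vault would use a UUID here.
  issue(type: string, original: string): string {
    const id = (this.counter++).toString(16).padStart(8, '0');
    const token = `<${type}_${id}>`;
    this.entries.set(token, original);
    return token;
  }

  lookup(token: string): string | undefined {
    return this.entries.get(token);
  }
}

// A detected entity: type plus absolute character offsets (end exclusive).
interface Span {
  type: string;
  start: number;
  end: number;
}

// Replace each span with a token, recording the original text in the vault.
function tokenize(text: string, spans: Span[], vault: DemoVault): string {
  // Splice right-to-left so the offsets of earlier spans stay valid.
  const ordered = [...spans].sort((a, b) => b.start - a.start);
  let out = text;
  for (const { type, start, end } of ordered) {
    const token = vault.issue(type, text.slice(start, end));
    out = out.slice(0, start) + token + out.slice(end);
  }
  return out;
}

// Swap every known token back; tokens the vault has never seen are left as-is.
function restore(text: string, vault: DemoVault): string {
  return text.replace(/<[A-Z]+_[0-9a-f]+>/g, (t) => vault.lookup(t) ?? t);
}
```

Because both NER spans and structured-PII spans would mint tokens from the same vault in this sketch, a single `restore` pass is enough to undo both, which is the property the section above relies on.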
Options
createNerRedactor({
model: 'Xenova/bert-base-NER', // any transformers.js token-classification model
threshold: 0.6, // min confidence
labels: ['NAME', 'ORG'], // restrict which types to redact
infer: myGlinerInferer, // bring your own model / service / mock
});
Bring your own model (e.g. GLiNER)
infer is (text) => Promise<RawEntity[]> where each entity has
{ entity_group, score, start, end } (absolute char offsets). Map your model's
output to that shape and the rest of the pipeline is identical — which also makes
the whole redactor unit-testable without downloading a model.
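As an illustration of that contract, the sketch below builds a dictionary-backed inferer that returns entities in the `{ entity_group, score, start, end }` shape. The `RawEntity` interface is re-declared locally from the description above rather than imported, `makeDictionaryInferer` and the `PER` / `ORG` group names are hypothetical, and treating `end` as exclusive is an assumption. Check the package's exported types and the labels your model actually emits before relying on any of it.

```typescript
// Local re-declaration of the entity shape described above (an assumption,
// not imported from the package).
interface RawEntity {
  entity_group: string;
  score: number;
  start: number; // absolute char offset
  end: number;   // absolute char offset, assumed exclusive
}

type Inferer = (text: string) => Promise<RawEntity[]>;

// Hypothetical helper: a deterministic inferer backed by a fixed term list.
// Keys are the exact strings to find, values are the entity group to report.
function makeDictionaryInferer(terms: Record<string, string>): Inferer {
  return async (text) => {
    const found: RawEntity[] = [];
    for (const [term, group] of Object.entries(terms)) {
      if (term.length === 0) continue; // an empty term would never advance
      let at = text.indexOf(term);
      while (at !== -1) {
        found.push({ entity_group: group, score: 1, start: at, end: at + term.length });
        at = text.indexOf(term, at + term.length);
      }
    }
    // Return entities in document order.
    return found.sort((a, b) => a.start - b.start);
  };
}

const mockInfer = makeDictionaryInferer({ 'Ada Lovelace': 'PER', 'Acme Corp': 'ORG' });
```

A function like `mockInfer` is the kind of value the `infer` option is described as accepting, so a test suite could exercise the redaction path with fixed, predictable entities and no model download.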
Honest limitations
- First call is slow: the model downloads (~30 MB+) and warms up. Call ner.warmup() at boot.
- Not on the streaming hot path. NER is a batched forward pass; it runs on the outbound prompt, not per-chunk on the response. Restoration of the response is the core's job (sync, chunk-boundary-safe).
- Accuracy is model-bound. bert-base-NER is solid for Western names/orgs; multilingual and domain names need a better model (swap via model/infer). We'd rather you know that than oversell it.
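The warm-up advice follows from lazy loading: the first caller pays the load cost unless something triggers the load earlier. One common way to get that behaviour is a cached promise, sketched below. `createLazy` is a hypothetical helper, the loader is a stub standing in for a model download, and none of this is taken from the package's internals.

```typescript
// Hypothetical helper: run `load` at most once and share the result.
function createLazy<T>(load: () => Promise<T>) {
  let pending: Promise<T> | undefined;

  const get = (): Promise<T> => {
    if (!pending) {
      // Cache the promise, not the value, so concurrent callers share one load.
      pending = load().catch((err) => {
        pending = undefined; // allow a retry after a failed load
        throw err;
      });
    }
    return pending;
  };

  // Trigger the load early (for example at boot) and discard the result.
  const warmup = async (): Promise<void> => {
    await get();
  };

  return { get, warmup };
}

// Stub loader standing in for a slow model download; counts how often it runs.
let loads = 0;
const model = createLazy(async () => {
  loads += 1;
  return { name: 'stub-model' };
});
```

With this shape, calling `warmup()` during startup moves the one-time cost off the first user request, and later `get()` calls resolve from the cached promise.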
License
MIT © Godfrey Lebo / Raeven Company LTD
