gheim
v0.1.7
PII round-trip for LLM APIs: anonymize before the request, de-anonymize the stream on the way back.
Detect PII in text, substitute it with stable sentinels (<PERSON_1>,
<EMAIL_2>, ...), send the redacted text to any LLM, and restore the
originals on the way back, including in streamed responses. The package
is framework-agnostic and ships a drop-in openai client wrapper for
zero-effort integration.
See the monorepo README for the cross-language overview and architecture.
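The round trip described above can be illustrated with a small, self-contained sketch. It does not use gheim's code: `SentinelTable`, `toyDetect`, and `redact` are hypothetical names, the detector is a toy (one hard-coded name plus an e-mail regex), and the shared-counter numbering is only a guess based on the `<PERSON_1>`, `<EMAIL_2>` examples.

```typescript
// Illustrative only: a toy detector plus a sentinel table, not gheim's internals.
type Entity = { start: number; end: number; label: string };

class SentinelTable {
  private toOriginal = new Map<string, string>();
  private toSentinel = new Map<string, string>();

  // Return the same sentinel every time the same (label, value) pair is seen.
  sentinelFor(label: string, value: string): string {
    const key = label + "\u0000" + value;
    let sentinel = this.toSentinel.get(key);
    if (sentinel === undefined) {
      sentinel = `<${label}_${this.toSentinel.size + 1}>`;
      this.toSentinel.set(key, sentinel);
      this.toOriginal.set(sentinel, value);
    }
    return sentinel;
  }

  // Swap known sentinels back to their originals; unknown ones stay untouched.
  restore(text: string): string {
    return text.replace(/<[A-Z]+_\d+>/g, (m) => this.toOriginal.get(m) ?? m);
  }
}

// Toy detector (assumption): one hard-coded name and a simple e-mail pattern.
function toyDetect(text: string): Entity[] {
  const patterns: Array<[string, RegExp]> = [
    ["PERSON", /\bJoel\b/g],
    ["EMAIL", /[\w.+-]+@[\w-]+(?:\.[\w-]+)+/g],
  ];
  const found: Entity[] = [];
  for (const [label, re] of patterns) {
    let m: RegExpExecArray | null;
    while ((m = re.exec(text)) !== null) {
      found.push({ start: m.index, end: m.index + m[0].length, label });
    }
  }
  return found.sort((a, b) => a.start - b.start);
}

// Replace every detected entity with its sentinel, left to right.
function redact(text: string, table: SentinelTable): string {
  let out = "";
  let cursor = 0;
  for (const e of toyDetect(text)) {
    out += text.slice(cursor, e.start) + table.sentinelFor(e.label, text.slice(e.start, e.end));
    cursor = e.end;
  }
  return out + text.slice(cursor);
}

const demoTable = new SentinelTable();
console.log(redact("Hi, my name is Joel. Mail joel@example.com or ask Joel.", demoTable));
// Hi, my name is <PERSON_1>. Mail <EMAIL_2> or ask <PERSON_1>.
console.log(demoTable.restore("Hello <PERSON_1>, I will write to <EMAIL_2>."));
// Hello Joel, I will write to joel@example.com.
```

Keeping one table per conversation is what makes the sentinels stable: the second mention of the same name reuses `<PERSON_1>`, so the model can still tell that both mentions refer to one person.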
Install
npm install gheim # or: bun add gheim
npm install gheim openai # add the drop-in OpenAI client
npm install gheim @huggingface/transformers # add on-device detection
openai and @huggingface/transformers are optional peer dependencies.
The core package itself has no runtime dependencies.
Model choice
LocalDetector loads a token-classification model via
@huggingface/transformers (transformers.js). The package's default
model is joelbarmettler/gheim-ch-560m, a 560M xlm-roberta-large
fine-tune optimised for Swiss-market PII (test strict F1 0.910, char F1
0.946 on Swiss text; see MODEL_CARD.md).
Any HuggingFace token-classification model that emits the same 33-class
BIOES schema can be substituted via model:.
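For readers unfamiliar with BIOES tagging, the sketch below shows how per-token tags (B = begin, I = inside, E = end, S = single, O = outside) are typically collapsed into character spans. It is a generic decoder, not gheim's code: `decodeBioes` and both types are hypothetical, and the `PERSON` and `LOCATION` labels are placeholders, since the 33 class names are not listed here.

```typescript
// Generic BIOES decoder; labels and names are illustrative assumptions.
type TaggedToken = { start: number; end: number; tag: string };
type Span = { start: number; end: number; label: string };

function decodeBioes(tokens: TaggedToken[]): Span[] {
  const spans: Span[] = [];
  let open: Span | null = null; // entity started by B- and not yet closed by E-

  for (const tok of tokens) {
    const dash = tok.tag.indexOf("-");
    const prefix = dash < 0 ? tok.tag : tok.tag.slice(0, dash);
    const label = dash < 0 ? "" : tok.tag.slice(dash + 1);

    if (prefix === "S") {
      // Single-token entity.
      spans.push({ start: tok.start, end: tok.end, label });
      open = null;
    } else if (prefix === "B") {
      // Start a multi-token entity; it is emitted only once E- arrives.
      open = { start: tok.start, end: tok.end, label };
    } else if ((prefix === "I" || prefix === "E") && open !== null && open.label === label) {
      open.end = tok.end;
      if (prefix === "E") {
        spans.push(open);
        open = null;
      }
    } else {
      // "O" or a malformed sequence: drop any unfinished entity.
      open = null;
    }
  }
  return spans;
}

// Token offsets for the text "Anna Muster lives in Bern".
console.log(decodeBioes([
  { start: 0, end: 4, tag: "B-PERSON" },
  { start: 5, end: 11, tag: "E-PERSON" },
  { start: 12, end: 17, tag: "O" },
  { start: 18, end: 20, tag: "O" },
  { start: 21, end: 25, tag: "S-LOCATION" },
]));
// [ { start: 0, end: 11, label: "PERSON" }, { start: 21, end: 25, label: "LOCATION" } ]
```

This decoder is deliberately strict and discards entities that never reach an E- tag; a production decoder might prefer to keep them.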
| Model | Best for | Parameters | Notes |
|---|---|---:|---|
| joelbarmettler/gheim-ch-560m (default) | Production / commercial. Swiss court / parliament / web text with CH-format account numbers (IBAN, AHV, VAT-CHE) | 560M | Apache 2.0. Test strict F1 0.910, char F1 0.946. Hub repo ships both fp32 and onnx/model_quantized.onnx — load with dtype: "q8" for browser deployment. |
| joelbarmettler/gheim-ch-560m-research | Research / non-commercial. Stronger cross-domain transfer on Swiss-news text (swissner PER char F1 0.90 vs 0.70 on the default) | 560M | CC BY-NC-SA 4.0 + Reuters research-only rider. safetensors only — no browser-targeted ONNX. |
| openai/privacy-filter | English-first or general use, long-context (up to 128k tokens) | 1.4B (50M active, MoE) | Apache 2.0. Wider language coverage, larger weights. |
import { LocalDetector } from "gheim";
// Default — Swiss-tuned, q8 ONNX. `device: "auto"` probes WebGPU in
// the browser and falls back to WASM; in Node it's WASM directly.
const det = new LocalDetector({ dtype: "q8" });
// Stronger cross-domain transfer (research, non-commercial; safetensors only):
const detResearch = new LocalDetector({ model: "joelbarmettler/gheim-ch-560m-research" });
// Alternative for English or general use:
const detEn = new LocalDetector({ model: "openai/privacy-filter" });
Drop-in OpenAI client
import { OpenAI } from "gheim/openai";
// Same constructor shape as `openai`. `apiKey`, `baseURL`, etc. work
// at the top level (forwarded to the inner client). gheim-specific
// keys: `gheimDetector`, `gheimStrict`, `openaiClient`, `clientOptions`.
const client = new OpenAI();
const r = await client.chat.completions.create({
model: "gpt-4o",
messages: [{ role: "user", content: "Hi, my name is Joel" }],
});
// r.choices[0].message.content contains "Joel".
// OpenAI only ever saw "<PERSON_1>".
Custom endpoint or key (e.g. OpenRouter, local vLLM):
import { OpenAI } from "gheim/openai";
const client = new OpenAI({
apiKey: process.env.OPENROUTER_API_KEY,
baseURL: "https://openrouter.ai/api/v1",
});
Streaming:
const stream = await client.chat.completions.create({ ..., stream: true });
for await (const chunk of stream) {
process.stdout.write(chunk.choices[0]?.delta.content ?? "");
}
Per-call overrides:
import { Session } from "gheim";
const session = new Session(); // reuse across calls
const r = await client.chat.completions.create({
model: "gpt-4o",
messages: [...],
gheimSession: session, // or gheimDetector
});
All non-chat OpenAI resources are on client.raw once await client.ready() resolves.
Framework-agnostic
import { Session, LocalDetector, anonymizeText, deanonymizeText } from "gheim";
const session = new Session();
(session as any).detector = new LocalDetector({ dtype: "q8" });
// ^ defaults to joelbarmettler/gheim-ch-560m
const clean = await anonymizeText("Hi, my name is Joel", session);
// ... call any LLM with clean ...
const final = deanonymizeText(responseText, session);
Streaming deanonymizer for any text-chunk source:
import { deanonymizeStream } from "gheim";
for await (const chunk of deanonymizeStream(myChunkIterator, session)) {
process.stdout.write(chunk);
}
OpenAI-typed helpers when you want manual control:
import { anonymizeOpenAIMessages, deanonymizeOpenAIStream } from "gheim/openai";
Wrapped endpoints
The drop-in OpenAI client automatically protects every text-carrying endpoint:
chat.completions, responses, completions (legacy), embeddings,
moderations, audio.speech, audio.transcriptions, audio.translations,
images.generate, images.edit. Tool-call arguments and SSE delta chunks are
restored on the way back. See the monorepo README for the full
coverage matrix and the embeddings caveat.
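Restoring sentinels in streamed output is less trivial than it sounds, because a sentinel such as `<PERSON_1>` can arrive split across two chunks. The sketch below shows one common way to handle that: hold back any trailing text that could still grow into a sentinel. It is not gheim's implementation; `SentinelStreamRestorer`, its methods, and the 32-character cap are assumptions made for illustration.

```typescript
// Illustrative buffer-and-hold approach; not gheim's internals.
class SentinelStreamRestorer {
  private pending = "";
  private mapping: Map<string, string>;
  private maxSentinelLength: number;

  constructor(mapping: Map<string, string>, maxSentinelLength = 32) {
    this.mapping = mapping;
    this.maxSentinelLength = maxSentinelLength;
  }

  // Feed one chunk; returns the text that is safe to emit right now.
  push(chunk: string): string {
    this.pending += chunk;
    // A trailing "<" followed only by sentinel characters may be an unfinished sentinel.
    const tail = /<[A-Z0-9_]*$/.exec(this.pending);
    const cut =
      tail !== null && tail[0].length <= this.maxSentinelLength
        ? tail.index
        : this.pending.length;
    const ready = this.pending.slice(0, cut);
    this.pending = this.pending.slice(cut);
    return this.replaceSentinels(ready);
  }

  // Call once the stream has ended to release whatever is still held back.
  flush(): string {
    const rest = this.pending;
    this.pending = "";
    return this.replaceSentinels(rest);
  }

  private replaceSentinels(text: string): string {
    return text.replace(/<[A-Z][A-Z_]*_\d+>/g, (m) => this.mapping.get(m) ?? m);
  }
}

const restorer = new SentinelStreamRestorer(
  new Map([["<PERSON_1>", "Joel"], ["<EMAIL_2>", "joel@example.com"]]),
);
let restored = "";
for (const chunk of ["Hello <PER", "SON_1>, mail <EMAIL", "_2> now. 1 < 2"]) {
  restored += restorer.push(chunk);
}
restored += restorer.flush();
console.log(restored); // Hello Joel, mail joel@example.com now. 1 < 2
```

The same `push` and `flush` calls would sit inside a `for await` loop over an SSE stream. Holding back only the suspicious tail keeps latency low: ordinary text is emitted immediately, and a literal `<` that turns out not to start a sentinel is released as soon as the next character rules it out.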
Strict mode
gheimStrict: true (default) throws GheimStrictError if you call an
unwrapped endpoint (beta, batches, files, uploads, fineTuning,
vectorStores). The error names client.raw.<path> as the escape hatch.
const client = new OpenAI({ gheimStrict: false }); // downgrade to one-time warnings
await client.ready();
client.raw.beta.assistants.create(...); // always works regardless of strict mode
Detectors
import { LocalDetector, RemoteDetector, defaultDetector } from "gheim";
// Local: via @huggingface/transformers, on Node, Bun, and browsers.
// `device: "auto"` (default) probes navigator.gpu in the browser and
// falls back to WASM if WebGPU isn't usable; in Node it's WASM only.
const local = new LocalDetector({
device: "auto",
dtype: "q8",
onProgress: (e) => {
if (e.fraction != null) console.log(`loading: ${(e.fraction * 100) | 0}%`);
},
});
await local.load();
console.log("backend in use:", local.actualDevice); // "webgpu" or "wasm"
// Remote: against your own gheim-server or api.gheim.ch.
const remote = new RemoteDetector({
baseUrl: "http://your-host:8080",
apiKey: "...",
});
// Picks remote when GHEIM_API_KEY is set in the environment, else local.
const auto = defaultDetector();
Runtime support
Tested on Node 18+, Bun 1.1+, and modern browsers (Chrome 113+, Edge, Safari 17+, Firefox via flag). Browser builds use WebGPU when available and fall back to WebAssembly automatically.
For Node servers, installing onnxruntime-node alongside
@huggingface/transformers enables the native ONNX backend, which is
typically 5-10× faster than the default WebAssembly backend.
Artifacts
Ships as dual ESM and CJS with full .d.ts typings:
dist/esm/{index,openai}.js
dist/cjs/{index,openai}.cjs
dist/types/**/*.d.ts
Composite detector (recommended for production)
For categories where structure is verifiable by checksum (CH-IBAN, AHV,
VAT-CHE, credit cards, common token formats), the Python sibling
package ships a composite detector that pairs the regex catalogue with
the model. The JavaScript package implements the model side; combine
with your own regex layer or proxy through gheim-server, which
applies the composite detector internally.
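As a starting point for "your own regex layer", here is a minimal sketch of a checksum-verified detector for Swiss IBANs, using the standard ISO 13616 mod-97 check. It is not the Python package's catalogue: `isValidIban`, `findSwissIbans`, the `Span` shape, and the `IBAN` label are assumptions, and merging these spans with the model's output is left to the caller.

```typescript
// Illustrative regex-plus-checksum layer; names and label are assumptions.
type Span = { start: number; end: number; label: string };

// ISO 13616 check: move the first four characters to the end, map letters
// to numbers (A=10 ... Z=35), and require the result to be 1 modulo 97.
function isValidIban(raw: string): boolean {
  const iban = raw.replace(/\s+/g, "").toUpperCase();
  if (!/^[A-Z]{2}\d{2}[A-Z0-9]{11,30}$/.test(iban)) return false;
  const rearranged = iban.slice(4) + iban.slice(0, 4);
  let remainder = 0;
  for (let i = 0; i < rearranged.length; i++) {
    const value = parseInt(rearranged.charAt(i), 36); // digit -> 0..9, letter -> 10..35
    // Letters contribute two decimal digits, digits contribute one.
    remainder = (remainder * (value < 10 ? 10 : 100) + value) % 97;
  }
  return remainder === 1;
}

// Find CH IBAN candidates (21 characters, optionally grouped in fours)
// and keep only those that pass the checksum.
function findSwissIbans(text: string): Span[] {
  const candidate = /\bCH\d{2}(?: ?[0-9A-Z]{4}){4} ?[0-9A-Z]\b/g;
  const spans: Span[] = [];
  let m: RegExpExecArray | null;
  while ((m = candidate.exec(text)) !== null) {
    if (isValidIban(m[0])) {
      spans.push({ start: m.index, end: m.index + m[0].length, label: "IBAN" });
    }
  }
  return spans;
}

const sample = "Pay to CH93 0076 2011 6238 5295 7 by Friday, not CH93 0076 2011 6238 5295 8.";
for (const s of findSwissIbans(sample)) {
  console.log(s.label, sample.slice(s.start, s.end)); // IBAN CH93 0076 2011 6238 5295 7
}
```

The checksum is what makes this layer precise: the second number in the sample has the right shape but fails mod-97, so it is not reported. Pattern-only categories have no such safeguard and are better left to the model.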
License
Apache 2.0. Model weights are governed by the upstream license of the model you select.
