image-pii-redactor

v0.8.1

Published

2 months ago

Client-side PII redaction for screenshot images. OCR + NER in the browser — your data never leaves your device.

athal7

pii redaction privacy ocr ner client-side browser image screenshot web-component tesseract transformers

A Web Component that redacts personal information from AI chat screenshots entirely in the browser. No data ever leaves your device.

Try the live demo →

<pii-redactor></pii-redactor>

Upload a screenshot → PII is detected and highlighted → review and adjust → export a redacted PNG.

How it works

1. OCR: Tesseract.js extracts text and word-level bounding boxes from the image.
2. PII detection: a multilingual NER model (onnx-community/multilang-pii-ner-ONNX) via Transformers.js identifies names, addresses, phone numbers, etc. Regex patterns cover structured PII (SSN, credit card, email, IP address) as a fallback.
3. Review: an SVG overlay lets you toggle, add, or remove redaction boxes before exporting.
4. Export: the final redacted PNG is rendered on a Canvas and returned as a Blob.

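Everything runs in the browser. The OCR engine, NER model, and image processing run on WebAssembly, with WebGPU acceleration where the browser supports it. Apart from the one-time model download, there is no server, no API call, and no telemetry.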
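To make the regex step concrete, here is a minimal sketch of how structured PII could be located in OCR text and reported as character offsets. The patterns, the label names, and the `findStructuredPii` helper are assumptions for illustration. They are not taken from the library, whose actual patterns may be stricter.

```javascript
// Hypothetical patterns for illustration; the library's own regexes may differ.
const PATTERNS = [
  { label: 'EMAIL', regex: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  { label: 'SSN', regex: /\b\d{3}-\d{2}-\d{4}\b/g },
  { label: 'IP_ADDRESS', regex: /\b(?:\d{1,3}\.){3}\d{1,3}\b/g },
  // 13 to 16 digits, optionally separated by single spaces or hyphens.
  { label: 'CREDIT_CARD', regex: /\b\d(?:[ -]?\d){12,15}\b/g },
];

// Returns one entry per match, with character offsets into `text`,
// sorted by position so the entries can later be mapped onto OCR word boxes.
function findStructuredPii(text) {
  const entities = [];
  for (const { label, regex } of PATTERNS) {
    for (const match of text.matchAll(regex)) {
      entities.push({
        label,
        start: match.index,
        end: match.index + match[0].length,
        source: 'regex',
      });
    }
  }
  return entities.sort((a, b) => a.start - b.start);
}
```

Returning offsets rather than the matched strings keeps the result free of the PII itself, in the same spirit as the metadata-only entities the component emits.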

Install

npm install image-pii-redactor

Usage

As a Web Component

<script type="module">
  import 'image-pii-redactor';
</script>

<pii-redactor></pii-redactor>

The component self-registers as <pii-redactor>. Drop it anywhere — it works in plain HTML, React, Vue, Svelte, or any framework.

Listening for the result

const redactor = document.querySelector('pii-redactor');

redactor.addEventListener('redaction-confirm', (e) => {
  const { blob, entities, width, height } = e.detail;
  // blob: PNG Blob with redactions burned in
  // entities: array of { label, bbox, source } — no PII, just metadata
});

redactor.addEventListener('redaction-cancel', () => {
  console.log('User cancelled');
});
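Because the entities in the event detail carry only metadata, they can safely be summarized for a status message or log. The sketch below counts entities per label. `summarizeEntities` is a hypothetical helper, not part of the package, and it relies only on the `label` field described above.

```javascript
// Count detected entities per label, e.g. to show "2 PERSON, 1 EMAIL" after export.
// Only the `label` field of each entity is read; label values depend on the model.
function summarizeEntities(entities) {
  const counts = {};
  for (const { label } of entities) {
    counts[label] = (counts[label] ?? 0) + 1;
  }
  return counts;
}

// Usage inside the listener (browser only):
// redactor.addEventListener('redaction-confirm', (e) => {
//   console.log(summarizeEntities(e.detail.entities));
// });
```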

Configuration

<pii-redactor
  lang="eng"
  min-confidence="0.7"
  use-regex="true"
  max-file-size="20971520"
></pii-redactor>

Or via JavaScript:

redactor.config = {
  lang: 'eng',                                          // Tesseract language code
  nerModel: 'onnx-community/multilang-pii-ner-ONNX',   // HuggingFace model ID
  minConfidence: 0.7,                                   // NER confidence threshold
  useRegex: true,                                       // also run regex patterns
  maxFileSize: 20 * 1024 * 1024,                        // 20 MB
  memoryMode: 'auto',                                   // 'auto' | 'low' | 'normal'
};

With memoryMode: 'auto', the component reads navigator.deviceMemory and switches to sequential model loading (OCR → terminate → NER) on devices that report less than 4 GB of RAM.
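The 'auto' rule can be sketched as a small pure function. This is an illustration of the behavior described above, not the library's code. The `resolveMemoryMode` name is hypothetical, and the fallback to 'normal' when `navigator.deviceMemory` is unavailable (the property is only exposed by Chromium-based browsers) is an assumption.

```javascript
const LOW_MEMORY_THRESHOLD_GB = 4;

// Resolve the configured mode to a concrete 'low' or 'normal'.
// Assumption: if the browser does not report deviceMemory, use 'normal'.
function resolveMemoryMode(mode, deviceMemory) {
  if (mode !== 'auto') return mode; // explicit 'low' or 'normal' is used as given
  if (typeof deviceMemory !== 'number') return 'normal';
  return deviceMemory < LOW_MEMORY_THRESHOLD_GB ? 'low' : 'normal';
}

// In a browser: resolveMemoryMode('auto', navigator.deviceMemory)
```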

Programmatic pipeline

Use the pipeline directly without the UI component:

import { analyzeImage, renderRedactedImage } from 'image-pii-redactor';

const result = await analyzeImage(imageBlob, {
  lang: 'eng',
  nerModel: 'onnx-community/multilang-pii-ner-ONNX',
  minConfidence: 0.7,
}, (progress) => console.log(progress.message));

// result.ocr       — full OCR text + word bboxes
// result.entities  — detected PII entities with char offsets
// result.redactions — proposed redaction boxes in pixel coords

const redactedBlob = await renderRedactedImage(imageBlob, result.redactions);
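The step from character offsets to pixel boxes can be pictured as follows. This is a sketch under stated assumptions: each OCR word is assumed to carry `start`/`end` character offsets alongside a Tesseract-style `bbox` of `{ x0, y0, x1, y1 }`, and `entityToBox` is a hypothetical helper. The real shape of `result.ocr` and `result.redactions` may differ.

```javascript
// Map one entity (character range) to the union of the word boxes it overlaps.
// Returns null when no OCR word overlaps the entity.
function entityToBox(entity, words) {
  // A word overlaps the entity if the two character ranges intersect.
  const hits = words.filter((w) => w.start < entity.end && w.end > entity.start);
  if (hits.length === 0) return null;
  return {
    x0: Math.min(...hits.map((w) => w.bbox.x0)),
    y0: Math.min(...hits.map((w) => w.bbox.y0)),
    x1: Math.max(...hits.map((w) => w.bbox.x1)),
    y1: Math.max(...hits.map((w) => w.bbox.y1)),
  };
}
```

A single union box is only adequate when the entity sits on one line. An entity that wraps across lines would need one box per line to avoid covering unrelated text.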

Service Worker (offline + privacy firewall)

After the first load, the component works fully offline. Register the included Service Worker to cache model files and optionally block all outbound network requests:

1. Copy node_modules/image-pii-redactor/public/pii-redactor-sw.js to your web root.
2. Register it on page load:

import { registerServiceWorker } from 'image-pii-redactor';

await registerServiceWorker();

Once registered, the SW intercepts HuggingFace model downloads and caches them. After models are warm, you can enable the network firewall to block all external requests — verifiable proof that no image data leaves the browser:

// controller is null until the Service Worker controls this page (e.g. on the very first load)
navigator.serviceWorker.controller?.postMessage({ type: 'ENABLE_FIREWALL' });
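As a rough mental model of the firewall, the decision the Service Worker makes for each request might look like the function below. This is a hypothetical sketch, not the logic shipped in pii-redactor-sw.js. The `firewallDecision` name, its parameters, and the choice to let same-origin requests through are all assumptions.

```javascript
// Decide what to do with one intercepted request.
// Assumption: cached responses are always served, and "external" means cross-origin.
function firewallDecision({ url, origin, firewallEnabled, cached }) {
  if (cached) return 'serve-from-cache';
  if (!firewallEnabled) return 'fetch';
  const sameOrigin = new URL(url).origin === origin;
  return sameOrigin ? 'fetch' : 'block';
}
```

In a real worker this decision would run inside the `fetch` event handler, with the `ENABLE_FIREWALL` message flipping the `firewallEnabled` flag.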

Privacy model

All processing is local. OCR, NER inference, and image rendering run entirely in the browser using WebAssembly (WASM) and optionally WebGPU.
Models are cached after the first download. Transformers.js and Tesseract.js store their model files in the browser's Cache API and IndexedDB, so subsequent runs skip the download and work offline.
Airplane mode works. After the first run, disconnect from the internet and reload — the tool continues to function. This is the user-facing proof that nothing is server-dependent.
The Service Worker provides a hard network fence. When enabled, the SW blocks all non-cached outbound requests at the browser level, making it impossible for image data to be exfiltrated even by a compromised dependency.

Browser support

| Feature | Requirement |
|---------|------------|
| OCR (Tesseract.js WASM) | Chrome 89+, Firefox 89+, Safari 15+ |
| NER (Transformers.js WASM) | Same as above |
| NER (WebGPU acceleration) | Chrome 113+, Edge 113+ |
| OffscreenCanvas (image pre-processing) | Chrome 69+, Firefox 105+ |
| Web Components | All modern browsers |

Safari is supported but WebGPU acceleration is not available — inference falls back to WASM automatically.
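The fallback happens inside the component, but if you want to know in advance which path a browser is likely to take, a coarse feature check is enough. The `preferredDevice` helper below is a hypothetical sketch and the returned strings are illustrative. It only tests for the WebGPU entry point and does not reproduce whatever detection the library performs.

```javascript
// Coarse check: 'webgpu' when the WebGPU entry point exists, otherwise 'wasm'.
// navigator.gpu.requestAdapter() can still resolve to null on unsupported
// hardware, so a stricter check would also await that call.
function preferredDevice(nav) {
  return nav?.gpu ? 'webgpu' : 'wasm';
}

// In a browser: preferredDevice(navigator)
```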

Development

git clone https://github.com/athal7/image-pii-redactor
cd image-pii-redactor
npm install

npm run dev          # start demo at http://localhost:5173
npm test             # unit tests (Vitest, ~250ms)
npm run test:e2e:fast  # fast e2e tests, no model download needed
npm run build        # production library build

E2e tests that exercise the full model pipeline require the dev server to be running:

npm run dev &
npm run test:e2e

License

MPL-2.0 — Mozilla Public License 2.0. Modifications to library files must be published under the same license; combining with proprietary code in a larger work is permitted.
