
js-document-autocapture

v1.0.6


Free browser-native document auto-capture SDK with ML detection, perspective correction, and quality gates. No watermarks, no server uploads.


js-document-autocapture

The free, open-source document auto-capture SDK for the browser.

Detect, capture, and perspective-correct documents directly in the browser — powered by ML with computer-vision fallback. No watermarks. No server uploads. No hidden costs.


Live Demo · React Version · Report Bug


Why js-document-autocapture?

Most document-capture SDKs lock you into expensive licenses, server-side processing, or watermarked outputs. This SDK is different:

| | js-document-autocapture | Others |
| -------------------------- | ------------------------------------------------- | ------------------------------ |
| Price | Free forever (MIT) | $$$+ per month |
| Privacy | 100% client-side — images never leave the browser | Cloud upload required |
| Watermarks | None | Often watermarked on free tier |
| Dependencies | Zero runtime deps — single ESM bundle | Heavy vendor lock-in |
| ML detection | Built-in ML + OpenCV fallback | One approach only |
| Perspective correction | GPU-accelerated (WebGL) with CPU auto-fallback | Varies |

Using React? Check out react-document-autocapture — ready-made hooks and components built on top of this SDK.


Features

  • ML-first detection with OpenCV/Hough fallback for difficult scenes
  • Automatic capture when the document is stable, well-framed, and passes quality gates
  • Perspective correction via GPU (WebGL) or CPU warp with strict validation
  • Manual corner adjustment support via quad coordinates in the capture result
  • Capture limits — set maxCaptures to stop after N captures, with a complete event when done
  • Event-driven API — subscribe to frame, detection, guidance, capture, complete, warning, error
  • Quality presets — fast, balanced, or high to control resolution, format, and stability
  • Zero runtime dependencies — all internals are bundled into a single ESM entry
  • Fully typed — complete TypeScript definitions included

Install

npm install js-document-autocapture
# or with yarn / pnpm
yarn add js-document-autocapture
pnpm add js-document-autocapture

ESM-only. Use import, not require.


Quick Start

import { createScanner } from 'js-document-autocapture';

const video = document.querySelector('video');

const scanner = createScanner({
  videoElement: video,
  autoCapture: true,
});

scanner.on('capture', (result) => {
  console.log(result.width, result.height, result.warpTierUsed);
  // result.blob       — captured image as a Blob
  // result.quad       — detected corner coordinates
  // result.elapsedMs  — total capture time
  const url = URL.createObjectURL(result.blob);
  document.querySelector('img').src = url;
});

scanner.on('guidance', (code) => {
  // 'DOCUMENT_NOT_FOUND' | 'MOVE_CLOSER' | 'HOLD_STEADY' | 'CAPTURING' | ...
  document.getElementById('hint').textContent = code;
});

await scanner.start();

That's it — three steps: create, listen, start.


Capture Limits (maxCaptures)

Set maxCaptures to automatically stop capturing after a fixed number of documents. Both auto-captures and manual captures count toward the limit.

const scanner = createScanner({
  videoElement: video,
  autoCapture: true,
  maxCaptures: 2, // capture exactly 2 documents
});

scanner.on('capture', (result) => {
  console.log(`Capture ${scanner.captureCount}`);
});

scanner.on('complete', (result) => {
  console.log(`All done! ${result.totalCaptures} documents captured`);
  console.log(result.captures); // array of all CaptureResult objects
  scanner.stop();
});

await scanner.start();

| Value | Behavior |
| --------------------- | ----------------------------------------------------------------------------- |
| undefined (default) | Unlimited captures — scanner never emits complete |
| 0 | Unlimited (same as undefined) |
| 1, 2, 3, … | Captures up to N documents, then emits complete and blocks further captures |

After reaching the limit, calling captureManual() will throw. Call start() again to reset the counter and begin a new session.
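The limit semantics above can be modeled as a simple counter. This is a hypothetical sketch for illustration, not the SDK's internal implementation:

```javascript
// Hypothetical model of the documented maxCaptures behavior:
// undefined/0 = unlimited; N = capture up to N, then complete and block.
class CaptureLimiter {
  constructor(maxCaptures) {
    this.max = maxCaptures || Infinity; // undefined and 0 both mean unlimited
    this.count = 0;
  }

  // Records one capture. Returns true when this capture hits the limit
  // (the moment 'complete' would fire); throws once the limit is reached,
  // mirroring captureManual() throwing after completion.
  record() {
    if (this.count >= this.max) {
      throw new Error('Capture limit reached');
    }
    this.count += 1;
    return this.count === this.max;
  }

  // Mirrors start() resetting the counter for a new session.
  reset() {
    this.count = 0;
  }
}
```

With `maxCaptures: 2`, the first `record()` returns false, the second returns true (the `complete` moment), and a third throws until `reset()` starts a new session.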


Configuration

const scanner = createScanner({
  // --- High-level controls ---
  detection: 'auto', // 'auto' | 'opencv' | 'ml' | 'hybrid'
  quality: 'balanced', // 'fast' | 'balanced' | 'high'
  autoCapture: true, // enable automatic capture on stable detection
  maxCaptures: 1, // stop after N captures (undefined = unlimited)
  webglWarp: true, // use GPU warp when available (CPU fallback automatic)
  mlFallback: true, // allow ML fallback when OpenCV misses
  cocoSsd: true, // COCO-SSD "book" detector for faster document finding
  postCaptureRefine: true, // enable safe post-capture corner refinement
  debug: false, // enable debug logging
  debugOverlay: 'off', // 'off' | 'basic' | 'full'

  // --- Required for camera ---
  videoElement: video, // the <video> element to use
});

Quality Presets

| Preset | Max Resolution | Format | JPEG Quality | Stability Frames |
| ---------- | -------------- | ------ | ------------ | ---------------- |
| fast | 1024 px | JPEG | 0.85 | 2 |
| balanced | 1920 px | PNG | 0.92 | 3 |
| high | 2048 px | PNG | 0.95 | 2 |

Preset values are defaults — any explicit field you set takes precedence.
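The precedence rule can be illustrated with a small merge sketch. The field names below mirror the table columns but are assumptions for illustration, not the SDK's actual internal keys:

```javascript
// Hypothetical sketch of preset precedence: preset values are defaults,
// and any explicitly set field overrides them.
const PRESETS = {
  fast:     { maxResolution: 1024, format: 'jpeg', jpegQuality: 0.85, stabilityFrames: 2 },
  balanced: { maxResolution: 1920, format: 'png',  jpegQuality: 0.92, stabilityFrames: 3 },
  high:     { maxResolution: 2048, format: 'png',  jpegQuality: 0.95, stabilityFrames: 2 },
};

// Later spread wins, so explicit user fields take precedence over the preset.
function resolveQuality(preset, overrides = {}) {
  return { ...PRESETS[preset], ...overrides };
}
```

For example, `resolveQuality('balanced', { maxResolution: 2560 })` raises the resolution cap while keeping the balanced preset's format and stability settings.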

Detection Modes

The SDK offers four detection strategies. Choose the one that best fits your use case:

| Mode | Strategy | Latency | Accuracy | Best For |
| -------- | --------------------------------------------------- | ---------- | ----------- | ------------------------------------------ |
| auto | ML-first + OpenCV fallback + COCO-SSD (recommended) | Low–Medium | Highest | General use — adapts to the scene |
| ml | ML graph model only, no OpenCV fallback | Low | High | Clean backgrounds, controlled environments |
| opencv | OpenCV Hough line detection only, no ML models | Lowest | Moderate | Low-end devices, no ML bundle desired |
| hybrid | OpenCV primary + ML fallback when CV misses | Medium | High | Challenging lighting, busy backgrounds |

How Detection Works

Frame → [COCO-SSD "book" detector] ──→ coarse document region (fast, ~5–15 ms)
      → [ML graph model]           ──→ precise 4-corner quad  (accurate, ~20–40 ms)
      → [OpenCV / Hough]           ──→ line-based quad         (no model needed)

In auto mode (default), all enabled providers run in parallel each frame. The engine picks the highest-confidence result:

  1. COCO-SSD detects a rectangular "book"-class object as a coarse document proxy — very fast on WebGL.
  2. ML graph model (doc-corner-v2) predicts precise corner keypoints for perspective correction.
  3. OpenCV / Hough acts as the safety net when ML misses or isn't loaded yet.
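The "pick the highest-confidence result" step can be sketched as follows. This is a simplified model of the selection logic, not the engine's actual code:

```javascript
// Hypothetical sketch of per-frame candidate selection: each provider
// reports a quad plus a confidence score, and the best scored quad wins.
function pickBestCandidate(candidates) {
  // candidates: [{ source: 'ml' | 'opencv' | 'coco-ssd', quad, score }, ...]
  let best = null;
  for (const c of candidates) {
    if (c.quad && (best === null || c.score > best.score)) {
      best = c;
    }
  }
  return best; // null when no provider found a document this frame
}
```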

COCO-SSD — Object-Level Detection

COCO-SSD is enabled by default (cocoSsd: true). It downloads a lightweight MobileNetV2 model (~5 MB) from the TensorFlow CDN on first use and runs inference via WebGL (with CPU fallback).

Why it matters:

  • Faster first-frame detection — COCO-SSD often finds the document within 1–2 frames, before the ML graph model has warmed up.
  • Robustness in clutter — object-level detection is less sensitive to shadows, glare, and busy backgrounds than edge-based methods.
  • Parallel execution — runs alongside the graph model at zero extra frame cost.

Tuning options:

const scanner = createScanner({
  cocoSsd: true, // enable/disable (default: true)
  cocoMinScore: 0.45, // minimum confidence threshold (0–1)
  cocoUseAsPrimaryInMlMode: true, // use COCO result as primary when ML is selected
});

Bandwidth note: The COCO-SSD model is fetched once from the TensorFlow CDN and cached by the browser. If your application requires zero external network requests, set cocoSsd: false to disable it.

Accuracy vs Speed Comparison

| Provider | Model Size | First Inference | Per-Frame (WebGL) | Corner Precision | Works Without GPU |
| ---------------------------- | ---------------- | --------------- | ----------------- | ----------------- | ----------------- |
| ML graph (doc-corner-v2) | 2.0 MB (bundled) | ~80–120 ms | ~20–40 ms | Sub-pixel | Yes (WASM) |
| COCO-SSD (MobileNetV2) | ~5 MB (CDN) | ~150–250 ms | ~5–15 ms | Bounding-box | Yes (CPU) |
| OpenCV / Hough | ~4 MB (bundled) | ~30–50 ms | ~8–20 ms | Line-intersection | Yes |

Timings measured on a mid-range laptop (M1 MacBook Air). Mobile devices may be 1.5–3× slower.

Choosing the Right Mode

  • auto (default) — Start here. The SDK combines all available providers, picks the best quad each frame, and gracefully degrades on weaker hardware. This gives you both the speed of COCO-SSD and the precision of the ML graph model.
  • ml — Use when you know the user has a capable device and want maximum corner accuracy without OpenCV overhead.
  • opencv — Use when you want the smallest possible bundle or need to avoid any ML model downloads (set cocoSsd: false too).
  • hybrid — Use when OpenCV works well for most of your documents but you need ML as a backup for tricky cases.
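For example, combining the opencv mode with cocoSsd: false gives a fully offline setup with the smallest footprint. A sketch using only the options documented above:

```javascript
import { createScanner } from 'js-document-autocapture';

// Sketch: smallest-footprint, zero-network configuration —
// OpenCV-only detection and COCO-SSD disabled, so no models
// are fetched from any CDN.
const scanner = createScanner({
  videoElement: document.querySelector('video'),
  detection: 'opencv', // no ML bundle needed
  cocoSsd: false,      // no CDN download
  autoCapture: true,
});

await scanner.start();
```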

Events

scanner.on('frame', (frame) => {
  // Fires every processed frame
  // frame.detection — candidate quads, best match, source ('ml' | 'opencv')
  // frame.stability — { stable, stableMs }
  // frame.quality   — quality gate results
  // frame.guidance  — guidance code string
});

scanner.on('detection', (detection) => {
  // detection.bestCandidate — top quad with score
  // detection.candidates    — all candidate quads
  // detection.source        — 'ml' | 'opencv'
  // detection.status        — 'found' | 'not_found' | 'rejected'
});

scanner.on('capture', (result) => {
  // result.blob                 — captured image Blob (PNG or JPEG per quality preset)
  // result.width                — output width
  // result.height               — output height
  // result.quad                 — corner coordinates (topLeft, topRight, bottomRight, bottomLeft)
  // result.warpTierUsed         — 'webgl' | 'cpu' | 'raw'
  // result.captureDecisionSource — 'auto' | 'manual'
  // result.elapsedMs            — capture latency
});

scanner.on('complete', (result) => {
  // Fires when maxCaptures limit is reached
  // result.totalCaptures — number of captures completed
  // result.captures      — array of all CaptureResult objects from this session
});

scanner.on('guidance', (code) => {
  /* guidance code string */
});
scanner.on('warning', (message) => {
  /* non-fatal warning */
});
scanner.on('error', (error) => {
  /* Error object */
});
scanner.on('capabilities', (caps) => {
  /* browser capability report */
});

Session Control

await scanner.start(); // start camera + detection loop
await scanner.stop(); // stop detection, release camera
await scanner.destroy(); // full teardown, release all resources
const result = await scanner.captureManual(); // trigger manual capture
scanner.updateConfig({ quality: 'high' }); // update config at runtime
const caps = scanner.getCapabilities(); // { webgl, offscreen, worker, ... }

Utility Exports

import { detectCapabilities, selectExecutionMode } from 'js-document-autocapture';

// Check what the current browser supports
const caps = await detectCapabilities();
// caps.webgl, caps.offscreenCanvas, caps.webWorker, ...

// Get recommended execution mode for the environment
const mode = selectExecutionMode(caps);
// 'worker-gpu' | 'worker-cpu' | 'main-cpu' | ...
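As a rough mental model, the mapping from capabilities to mode might look like the sketch below. This is hypothetical — the actual selectExecutionMode logic may weigh more factors:

```javascript
// Hypothetical capability-to-mode mapping, mirroring the tiers named
// above — not the SDK's actual selectExecutionMode implementation.
function chooseMode(caps) {
  if (caps.webWorker && caps.webgl) return 'worker-gpu';
  if (caps.webWorker) return 'worker-cpu';
  return 'main-cpu';
}
```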

Type Exports

import type {
  ScannerConfig,
  ScannerSession,
  CaptureResult,
  CaptureCompleteResult, // { totalCaptures, captures }
  Detection, // 'auto' | 'opencv' | 'ml' | 'hybrid'
  Quality, // 'fast' | 'balanced' | 'high'
  Capabilities,
  ScannerEventMap,
  ScannerEventName,
  WarpTierUsed, // 'webgl' | 'cpu' | 'raw'
} from 'js-document-autocapture';

Security & Privacy

Your users' data stays with your users. Period.

  • 100% client-side processing — no images, frames, or metadata are ever transmitted to any server
  • No telemetry, no analytics, no tracking — the SDK phones home to absolutely nobody
  • Minimal external requests — COCO-SSD (cocoSsd: true by default) fetches a ~5 MB model from the TensorFlow CDN once and caches it. All other ML models and OpenCV modules are bundled. Set cocoSsd: false for zero network requests
  • Open source & auditable — every line of code is available for inspection under the MIT license
  • Zero runtime dependencies — no transitive supply-chain risk; what you install is what you get

This makes js-document-autocapture ideal for applications handling sensitive documents such as identity cards, passports, medical records, financial statements, and legal paperwork — where data residency and compliance (GDPR, HIPAA, SOC 2) are non-negotiable.


Browser Support

Modern browsers with ES2020+ support:

| Browser | Version |
| -------------- | ------- |
| Chrome / Edge | 90+ |
| Firefox | 90+ |
| Safari | 15+ |
| Chrome Android | 90+ |

Requires: Web Workers, Canvas API
Recommended: WebGL (GPU-accelerated warp) — falls back to CPU automatically


Related Packages

| Package | Description |
| -------------------------- | ------------------------------------------------------------------------------------------------------------------------------ |
| react-document-autocapture | React hooks & components built on this SDK — drop-in <DocumentAutoCaptureCamera />, useDocumentAutoCapture hook, and <CornerAdjustModal /> |


Contributing

Contributions, issues, and feature requests are welcome! Feel free to check the issues page.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Author

Built and maintained by Maaz Khan


License

MIT — free for personal and commercial use.