npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2026 – Pkg Stats / Ryan Hefner

usa-dl-ocr

v1.0.0

Published

Framework-agnostic USA driver license OCR extraction using Tesseract.js

Downloads

81

Readme

usa-dl-ocr

npm version License: MIT TypeScript

Framework-agnostic OCR extraction for USA driver licenses. Feed it an image, get structured data back.

Why This Exists

Parsing driver licenses shouldn't require vendor lock-in or framework coupling. This library provides a single, clean interface that works identically whether you're building a React app, an Express server, a CLI tool, or anything else. No adapters, no plugins, no boilerplate.

Installation

npm install usa-dl-ocr

For CLI usage:

npm install usa-dl-ocr commander

Quick Start

import { extract } from 'usa-dl-ocr';

const result = await extract(imageBlob);

console.log(result.data.licenseNumber);  // "D1234567"
console.log(result.data.dateOfBirth);    // "1990-01-15"
console.log(result.data.fullName);       // "JOHN DOE"

That's it. No setup, no configuration, no framework integration code.

API

extract(image, options?)

One-shot extraction. Creates a worker, processes the image, terminates the worker.

import { extract } from 'usa-dl-ocr';

// From file path (Node.js)
const result = await extract('/path/to/license.jpg');

// From URL
const result = await extract('https://example.com/license.jpg');

// From Blob/File (browser)
const result = await extract(fileInput.files[0]);

// From Buffer (Node.js)
const result = await extract(fs.readFileSync('license.jpg'));

// From ArrayBuffer or Uint8Array
const result = await extract(arrayBuffer);

DriverLicenseOCR Class

Same functionality, class-based interface if you prefer it.

import { DriverLicenseOCR } from 'usa-dl-ocr';

const ocr = new DriverLicenseOCR({ language: 'eng', timeout: 30000 });
const result = await ocr.extract(image);

createOCR(options?)

Persistent worker for batch processing. Significantly faster when processing multiple images.

import { createOCR } from 'usa-dl-ocr';

const ocr = await createOCR();

for (const image of images) {
  const result = await ocr.extract(image);
  // Process result
}

await ocr.terminate(); // Clean up when done

validate(data)

Check extraction completeness.

import { extract, validate } from 'usa-dl-ocr';

const { data } = await extract(image);
const validation = validate(data);

if (!validation.isValid) {
  console.log('Missing fields:', validation.issues);
  console.log('Completeness score:', validation.score);
}

Options

interface ExtractOptions {
  language?: string;      // Tesseract language code (default: 'eng')
  timeout?: number;       // Timeout in ms (default: 60000)
  psm?: number;          // Page segmentation mode (default: 3)
  oem?: number;          // OCR engine mode (default: 1)
  workerPath?: string;   // Custom Tesseract worker path
  corePath?: string;     // Custom Tesseract core path
  langPath?: string;     // Custom language data path
  onProgress?: (info: ProgressInfo) => void;
}

Extracted Fields

interface DriverLicenseData {
  fullName: string | null;
  firstName: string | null;
  middleName: string | null;
  lastName: string | null;
  licenseNumber: string | null;
  dateOfBirth: string | null;      // ISO format: YYYY-MM-DD
  expirationDate: string | null;   // ISO format: YYYY-MM-DD
  issueDate: string | null;        // ISO format: YYYY-MM-DD
  address: AddressInfo | null;
  sex: 'M' | 'F' | null;
  height: string | null;           // e.g., "5'10\""
  weight: string | null;           // e.g., "180 lbs"
  eyeColor: string | null;
  hairColor: string | null;
  issuingState: string | null;     // 2-letter code
  licenseClass: string | null;
  restrictions: string[];
  endorsements: string[];
  documentDiscriminator: string | null;
  rawText: string;
  confidence: number;              // 0-100
}

CLI

# Basic usage
usa-dl-ocr license.jpg

# Output formats
usa-dl-ocr license.jpg --format json
usa-dl-ocr license.jpg --format text
usa-dl-ocr license.jpg --format table

# Save to file
usa-dl-ocr license.jpg -o result.json

# Verbose with progress
usa-dl-ocr license.jpg -v

# Include raw OCR text
usa-dl-ocr license.jpg -r

# Custom timeout
usa-dl-ocr license.jpg -t 30000

Platform Support

Works anywhere JavaScript runs:

  • Node.js ≥18.0.0
  • Browsers: Chrome, Firefox, Safari, Edge (modern versions)
  • Electron
  • Deno (with Node compatibility)
  • Bun

Image Requirements

For best results:

  • Resolution: 300+ DPI
  • Format: JPEG, PNG, WebP, BMP, TIFF, GIF
  • Size: Under 10MB
  • Quality: Clear, in-focus, evenly lit
  • Orientation: Horizontal, minimal skew

Security

  • Input validation with size limits (10MB max)
  • URL validation for remote images
  • Text sanitization (control character removal)
  • No eval, no dynamic code execution
  • No network requests except to provided URLs
  • Timeout protection against hung operations

Performance Notes

  • First extraction takes longer due to Tesseract initialization (~2-5s)
  • Subsequent extractions with createOCR() are faster (~0.5-2s)
  • Higher resolution images take longer but produce better results
  • Consider resizing very large images client-side before processing

Limitations

  • OCR accuracy depends heavily on image quality
  • Damaged, faded, or obscured licenses may not parse correctly
  • Non-standard license formats may have lower extraction rates
  • Real-time processing not suitable for high-volume applications

Error Handling

import { extract } from 'usa-dl-ocr';

try {
  const result = await extract(image);
  
  if (!result.isDriverLicense) {
    // Image may not be a driver license
  }
  
  if (result.warnings.length > 0) {
    // Some fields couldn't be extracted
  }
} catch (error) {
  if (error.message.includes('timeout')) {
    // OCR took too long
  } else if (error.message.includes('exceeds maximum size')) {
    // Image too large
  } else if (error.message.includes('Unsupported')) {
    // Invalid input type
  }
}

Contributing

Issues and PRs welcome at github.com/sepehr-mohseni/usa-dl-ocr.

License

MIT © Sepehr Mohseni


Note: This library performs local OCR processing. No data is sent to external servers. All processing happens in-browser or in your Node.js process.