ocr-assert

v0.1.0

Published

8 days ago

Tolerant OCR assertions for UI testing

hemanthk04

ocr testing playwright assert ui-testing

ocr-assert helps reduce flaky OCR-based tests by normalizing common OCR confusions (like O ↔ 0, I/L ↔ 1, Z ↔ 2, S ↔ 5) and comparing text using a confusion-aware similarity score.

Status: early version (v0.1.0). API may evolve.

## Why?

When you validate UI text via screenshots + OCR (canvas apps, PDFs, images, charts, games), tiny OCR mistakes can break otherwise-correct tests:

- O recognized as 0
- I or L recognized as 1
- Z recognized as 2
- extra / missing whitespace

ocr-assert makes assertions more tolerant while still failing when errors look random or too dense.
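To make the distinction between "OCR-like" and "random" errors concrete, here is a minimal sketch that compares two strings position by position and counts how many mismatches fall into the confusion pairs listed above. `mismatchStats` and `OCR_PAIRS` are hypothetical names used only for illustration, and the position-wise comparison is a simplifying assumption; this is not ocr-assert's actual implementation.

```ts
// Assumed confusion pairs, taken from the examples in this README.
const OCR_PAIRS = new Set(["O0", "0O", "I1", "1I", "L1", "1L", "Z2", "2Z", "S5", "5S"]);

type MismatchStats = {
  mismatches: number;
  ocrLike: number;
  confusionRatio: number; // share of mismatches that look like OCR confusions
  errorDensity: number;   // mismatches per character
};

// Hypothetical helper: compares character by character, without alignment.
function mismatchStats(actual: string, expected: string): MismatchStats {
  const a = actual.toUpperCase();
  const e = expected.toUpperCase();
  const longest = Math.max(a.length, e.length);
  let mismatches = 0;
  let ocrLike = 0;
  for (let i = 0; i < longest; i++) {
    const x = a.charAt(i); // charAt returns "" past the end of the string
    const y = e.charAt(i);
    if (x === y) continue;
    mismatches++;
    if (OCR_PAIRS.has(x + y)) ocrLike++;
  }
  return {
    mismatches,
    ocrLike,
    confusionRatio: mismatches === 0 ? 0 : ocrLike / mismatches,
    errorDensity: longest === 0 ? 0 : mismatches / longest,
  };
}

console.log(mismatchStats("TOTAl 10O", "TOTAL 100")); // one mismatch, OCR-like
console.log(mismatchStats("TQTXL 1Q0", "TOTAL 100")); // three mismatches, none OCR-like
```

In this sketch the first comparison has a high confusion ratio and low error density, which is the kind of result a tolerant assertion could accept, while the second looks random and should fail.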

## Install

### Install (before npm publish)

Install directly from GitHub:

```bash
npm i github:hemanthk04/ocr-assert
```

## Quickstart

### 1) Assert two strings (already OCR’d)

```ts
import { assertOCR } from "ocr-assert";

assertOCR({
  actual: "TOTAl 10O",
  expected: "TOTAL 100",
  threshold: 0.85, // optional (default: 0.85)
});
```

### 2) OCR an image and assert

```ts
import { preprocessImage, extractText, assertOCR } from "ocr-assert";

const processed = await preprocessImage("./screenshot.png", {
  crop: { left: 100, top: 200, width: 500, height: 120 },
  grayscale: true,
  contrast: 1.2,
});

const text = await extractText(processed);

assertOCR({
  actual: text,
  expected: "PAYMENT SUCCESSFUL",
});
```

## Playwright example

```ts
import { test, expect } from "@playwright/test";
import { preprocessImage, extractText, assertOCR } from "ocr-assert";

test("canvas receipt shows success", async ({ page }) => {
  await page.goto("https://example.com");

  // page.screenshot() resolves to a Buffer, which preprocessImage accepts.
  const shot = await page.screenshot();

  const processed = await preprocessImage(shot, {
    grayscale: true,
    contrast: 1.2,
  });

  const actual = await extractText(processed);

  // assertOCR throws on failure; wrapping it in expect() makes the check
  // show up as an explicit assertion step in the test report.
  expect(() => assertOCR({ actual, expected: "SUCCESS" })).not.toThrow();
});
```

Tip: For best OCR accuracy, crop tightly to the text region.
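One way to follow this tip is to derive the `crop` option from a known bounding box, for example the `{ x, y, width, height }` object that Playwright's `locator.boundingBox()` resolves to. The sketch below is a hypothetical helper (`toCropRegion` is not part of ocr-assert) that pads the box, rounds outward to whole pixels, and clamps to the image size. It assumes screenshot pixels match CSS pixels (device scale factor of 1); the default padding of 4 is an arbitrary choice.

```ts
// Hypothetical helper, not part of ocr-assert.
type Box = { x: number; y: number; width: number; height: number };
type CropRegion = { left: number; top: number; width: number; height: number };

function toCropRegion(
  box: Box,
  imageWidth: number,
  imageHeight: number,
  padding = 4, // assumed default, tune for your font size
): CropRegion {
  // Expand by the padding, round outward to whole pixels, clamp to the image.
  const left = Math.max(0, Math.floor(box.x - padding));
  const top = Math.max(0, Math.floor(box.y - padding));
  const right = Math.min(imageWidth, Math.ceil(box.x + box.width + padding));
  const bottom = Math.min(imageHeight, Math.ceil(box.y + box.height + padding));
  if (right <= left || bottom <= top) {
    throw new Error("Bounding box lies outside the image");
  }
  return { left, top, width: right - left, height: bottom - top };
}

const cropExample = toCropRegion({ x: 100.4, y: 200.7, width: 500, height: 120 }, 1280, 720);
console.log(cropExample);
```

The returned object has the same shape as the `crop` option shown in the Quickstart, so it could be passed as `preprocessImage(shot, { crop: cropExample })`.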

## API

### `assertOCR(options)`

```ts
type AssertOptions = {
  actual: string;
  expected: string;
  threshold?: number; // default: 0.85
};

function assertOCR(options: AssertOptions): void;
```

- Normalizes both strings (uppercases, removes non-alphanumeric noise, applies OCR confusion normalization).
- Computes a confusion-aware similarity score.
- Uses an adaptive threshold in some cases (high confusion ratio with acceptable error density).
- Throws an error with diagnostic metrics on failure.
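The adaptive-threshold step described above could look roughly like the following. This is a sketch under stated assumptions, not ocr-assert's real logic: the names `effectiveThreshold` and `assertWithMetrics` are hypothetical, and the three cutoff constants are illustrative values chosen for the example.

```ts
type Metrics = { score: number; confusionRatio: number; errorDensity: number };

// All cutoffs below are illustrative assumptions, not ocr-assert's real values.
const MIN_CONFUSION_RATIO = 0.8;  // "high confusion ratio"
const MAX_ERROR_DENSITY = 0.25;   // "acceptable error density"
const THRESHOLD_RELAXATION = 0.1; // how far the threshold is lowered

// Relax the threshold only when errors are mostly OCR-like and sparse.
function effectiveThreshold(metrics: Metrics, threshold = 0.85): number {
  const mostlyOcrLike = metrics.confusionRatio >= MIN_CONFUSION_RATIO;
  const sparseErrors = metrics.errorDensity <= MAX_ERROR_DENSITY;
  return mostlyOcrLike && sparseErrors ? threshold - THRESHOLD_RELAXATION : threshold;
}

// Throws with diagnostic metrics when the score is below the required threshold.
function assertWithMetrics(metrics: Metrics, threshold = 0.85): void {
  const required = effectiveThreshold(metrics, threshold);
  if (metrics.score < required) {
    throw new Error(
      `OCR assertion failed: score=${metrics.score.toFixed(3)} < ${required.toFixed(3)} ` +
        `(confusionRatio=${metrics.confusionRatio.toFixed(2)}, ` +
        `errorDensity=${metrics.errorDensity.toFixed(2)})`,
    );
  }
}

// Passes: the score is under 0.85, but the errors are OCR-like and sparse.
assertWithMetrics({ score: 0.8, confusionRatio: 1, errorDensity: 0.1 });
console.log("passed with relaxed threshold");
```

With the same score of 0.8, a low confusion ratio or a high error density keeps the threshold at 0.85 in this sketch, so the assertion fails.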

### `extractText(input)`

```ts
function extractText(input: Buffer | string): Promise<string>;
```

Runs OCR using tesseract.js (English) and returns the trimmed text.

### `preprocessImage(input, options)`

```ts
type PreprocessOptions = {
  grayscale?: boolean;
  contrast?: number; // 1 = normal
  crop?: { left: number; top: number; width: number; height: number };
};

function preprocessImage(
  input: Buffer | string,
  options?: PreprocessOptions
): Promise<Buffer>;
```

Uses sharp to optionally crop, convert to grayscale, and adjust contrast.

### `normalizeText(input)`

```ts
function normalizeText(input: string): string;
```

Applies Unicode normalization, uppercasing, noise removal, OCR-safe canonicalization, and whitespace normalization.
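The steps listed above can be sketched in a few lines. This is an illustration only: `normalizeSketch` and `CANONICAL` are hypothetical names, the choice of NFKC and the exact noise pattern are assumptions, and the canonical map simply mirrors the confusions mentioned in this README (O → 0, I/L → 1, Z → 2, S → 5). The real `normalizeText` may differ in every detail.

```ts
// Assumed canonical forms, based on the confusions listed in this README.
const CANONICAL: Record<string, string> = { O: "0", I: "1", L: "1", Z: "2", S: "5" };

// Hypothetical sketch of the normalization pipeline, not the library's code.
function normalizeSketch(input: string): string {
  return input
    .normalize("NFKC")                                // Unicode normalization (assumed form)
    .toUpperCase()                                    // uppercasing
    .replace(/[^A-Z0-9\s]/g, "")                      // drop non-alphanumeric noise
    .replace(/[OILZS]/g, (ch) => CANONICAL[ch] ?? ch) // OCR-safe canonicalization
    .replace(/\s+/g, " ")                             // collapse whitespace runs
    .trim();
}

console.log(normalizeSketch("TOTAl 10O"));  // "T0TA1 100"
console.log(normalizeSketch("TOTAL 100"));  // "T0TA1 100"
```

Because both sides are mapped to the same canonical form, an OCR result with O/0 or l/1 swaps compares equal to the expected text after normalization.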

### `similarity(a, b)`

```ts
type SimilarityResult = {
  score: number;          // 0..1
  confusionRatio: number; // 0..1 (how many mismatches are OCR-like)
  errorDensity: number;   // 0..1 (mismatches per length)
};

function similarity(a: string, b: string): SimilarityResult;
```

Computes a weighted edit distance in which confusable substitutions are penalized less than random substitutions.
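As an illustration of the idea, here is a minimal weighted Levenshtein sketch. The names (`weightedDistance`, `similarityScore`), the confusable pairs, and the 0.25 substitution cost are assumptions made for this example; the library's actual weights and scoring formula are not documented here and may differ.

```ts
// Confusable pairs and the 0.25 cost are assumptions for illustration.
const CONFUSABLE_PAIRS: Array<[string, string]> = [
  ["O", "0"], ["I", "1"], ["L", "1"], ["Z", "2"], ["S", "5"],
];
const CONFUSABLE = new Set<string>();
for (const [a, b] of CONFUSABLE_PAIRS) {
  CONFUSABLE.add(a + b);
  CONFUSABLE.add(b + a);
}

function substitutionCost(a: string, b: string): number {
  if (a === b) return 0;
  return CONFUSABLE.has(a + b) ? 0.25 : 1;
}

// Levenshtein distance with a cheaper cost for confusable substitutions.
// Uses two rows of the dynamic-programming table.
function weightedDistance(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      curr[j] = Math.min(
        prev[j]! + 1,     // deletion
        curr[j - 1]! + 1, // insertion
        prev[j - 1]! + substitutionCost(a.charAt(i - 1), b.charAt(j - 1)),
      );
    }
    prev = curr;
  }
  return prev[b.length]!;
}

// Maps the distance to a 0..1 score relative to the longer string.
function similarityScore(a: string, b: string): number {
  const longest = Math.max(a.length, b.length);
  return longest === 0 ? 1 : 1 - weightedDistance(a, b) / longest;
}

console.log(similarityScore("TOTAL 10O", "TOTAL 100")); // confusable swap: small penalty
console.log(similarityScore("TOTAL 10X", "TOTAL 100")); // random swap: full penalty
```

In this sketch the O/0 swap costs 0.25 while the X/0 swap costs 1, so the first pair scores higher even though both differ by a single character.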

## How it works (high level)

1. Preprocess (optional): crop / grayscale / contrast
2. OCR (optional): extract text via Tesseract
3. Normalize: remove noise + convert common OCR confusions to canonical forms
4. Compare: weighted distance + confusion-aware scoring
5. Assert: fail only when similarity falls below the (possibly adaptive) threshold

## Troubleshooting

- OCR output is empty or garbage: crop tighter, increase contrast slightly, and make sure the text is large enough.
- False positives (mismatched text still passes): raise the threshold.
- False negatives from OCR confusions (correct text fails): lower the threshold a bit, or improve preprocessing.

## Roadmap (suggested)

- Alignment-aware confusion stats
- Reusable Tesseract worker for speed
- Custom confusion maps and normalization rules
- `contains` / regex-style assertions

## License

MIT
