ocr-assert
v0.1.0
Tolerant OCR assertions for UI testing
ocr-assert helps reduce flaky OCR-based tests by normalizing common OCR confusions (like O ↔ 0, I/L ↔ 1, Z ↔ 2, S ↔ 5) and comparing text using a confusion-aware similarity score.
Status: early version (v0.1.0). API may evolve.
## Why?
When you validate UI text via screenshots + OCR (canvas apps, PDFs, images, charts, games), tiny OCR mistakes can break otherwise-correct tests:
- `O` recognized as `0`
- `I` or `L` recognized as `1`
- `Z` recognized as `2`
- extra / missing whitespace
ocr-assert makes assertions more tolerant while still failing when errors look random or too dense.
## Install
### Install (before npm publish)
Install directly from GitHub:

```bash
npm i github:hemanthk04/ocr-assert
```
## Quickstart
### 1) Assert two strings (already OCR’d)
```ts
import { assertOCR } from "ocr-assert";

assertOCR({
  actual: "TOTAl 10O",
  expected: "TOTAL 100",
  threshold: 0.85, // optional (default: 0.85)
});
```

### 2) OCR an image and assert
```ts
import { preprocessImage, extractText, assertOCR } from "ocr-assert";

const processed = await preprocessImage("./screenshot.png", {
  crop: { left: 100, top: 200, width: 500, height: 120 },
  grayscale: true,
  contrast: 1.2,
});
const text = await extractText(processed);

assertOCR({
  actual: text,
  expected: "PAYMENT SUCCESSFUL",
});
```

## Playwright example
```ts
import { test, expect } from "@playwright/test";
import { preprocessImage, extractText, assertOCR } from "ocr-assert";

test("canvas receipt shows success", async ({ page }) => {
  await page.goto("https://example.com");
  const shot = await page.screenshot();
  const processed = await preprocessImage(shot, {
    grayscale: true,
    contrast: 1.2,
  });
  const actual = await extractText(processed);

  // Throws on failure (works well with expect().toThrow if you want)
  assertOCR({ actual, expected: "SUCCESS" });

  // Optional: still keep an explicit expect so test runners show an assertion step
  expect(true).toBeTruthy();
});
```

Tip: For best OCR accuracy, crop tightly to the text region.
## API
### assertOCR(options)

```ts
type AssertOptions = {
  actual: string;
  expected: string;
  threshold?: number; // default: 0.85
};

function assertOCR(options: AssertOptions): void;
```

- Normalizes both strings (uppercase, removes non-alphanumeric noise, applies OCR confusion normalization).
- Computes a confusion-aware similarity score.
- Uses an adaptive threshold in some cases (high confusion ratio with acceptable error density).
- Throws an error on failure with diagnostic metrics.
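
The adaptive threshold is only described qualitatively here, so the following is a minimal sketch of how such a rule could look, not the library's implementation. It assumes metrics with the same three fields as the `SimilarityResult` type returned by `similarity`. The cutoffs (`0.8`, `0.25`), the `0.1` relaxation, and the names `effectiveThreshold` and `sketchAssert` are hypothetical and chosen only for illustration.

```typescript
// Same shape as the documented SimilarityResult.
type Metrics = { score: number; confusionRatio: number; errorDensity: number };

// Hypothetical rule: relax the threshold when most mismatches look like
// OCR confusions AND the mismatches are sparse. All numbers are assumptions.
function effectiveThreshold(m: Metrics, base = 0.85): number {
  const mostlyConfusions = m.confusionRatio >= 0.8;
  const sparseErrors = m.errorDensity <= 0.25;
  return mostlyConfusions && sparseErrors ? base - 0.1 : base;
}

// Throws with the metrics in the message so a failing test is easy to diagnose.
function sketchAssert(m: Metrics, base = 0.85): void {
  const threshold = effectiveThreshold(m, base);
  if (m.score < threshold) {
    throw new Error(
      `OCR assertion failed: score=${m.score.toFixed(3)} < threshold=${threshold.toFixed(3)} ` +
        `(confusionRatio=${m.confusionRatio.toFixed(3)}, errorDensity=${m.errorDensity.toFixed(3)})`,
    );
  }
}

// Same score, different outcome depending on what the errors look like.
const ocrLike: Metrics = { score: 0.8, confusionRatio: 1, errorDensity: 0.1 };
const randomLike: Metrics = { score: 0.8, confusionRatio: 0.2, errorDensity: 0.1 };

sketchAssert(ocrLike); // passes under the relaxed threshold
try {
  sketchAssert(randomLike); // fails under the base threshold
} catch (err) {
  console.log((err as Error).message);
}
```

The point of the design is that a score slightly under the base threshold is tolerated only when the errors are both OCR-like and sparse; the same score with random-looking errors still fails.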
### extractText(input)

```ts
function extractText(input: Buffer | string): Promise<string>;
```

Runs OCR using tesseract.js (English) and returns trimmed text.
### preprocessImage(input, options)

```ts
type PreprocessOptions = {
  grayscale?: boolean;
  contrast?: number; // 1 = normal
  crop?: { left: number; top: number; width: number; height: number };
};

function preprocessImage(
  input: Buffer | string,
  options?: PreprocessOptions
): Promise<Buffer>;
```

Uses sharp to optionally crop, grayscale, and adjust contrast.
### normalizeText(input)

```ts
function normalizeText(input: string): string;
```

Applies Unicode normalization, uppercasing, noise removal, OCR-safe canonicalization, and whitespace normalization.
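
To make the normalization steps concrete, here is a minimal sketch of such a pipeline. It is an illustration under assumptions, not the library's code: the name `sketchNormalize`, the choice of NFKC, the exact noise regex, and the direction of the confusion map (letters mapped to digits) are all hypothetical.

```typescript
// Hypothetical confusion map: each confusable letter becomes a digit.
// The real canonical direction used by the library is not documented here.
const CONFUSIONS: Record<string, string> = {
  O: "0",
  I: "1",
  L: "1",
  Z: "2",
  S: "5",
};

function sketchNormalize(input: string): string {
  return input
    .normalize("NFKC") // Unicode normalization (compatibility forms folded)
    .toUpperCase() // uppercasing
    .replace(/[^A-Z0-9\s]/g, "") // drop anything that is not A-Z, 0-9, or whitespace
    .replace(/[OILZS]/g, (ch) => CONFUSIONS[ch] ?? ch) // canonicalize confusable letters
    .replace(/\s+/g, " ") // collapse whitespace runs
    .trim();
}

// Both strings reduce to the same canonical form.
console.log(sketchNormalize("TOTAl 10O")); // T0TA1 100
console.log(sketchNormalize("TOTAL 100")); // T0TA1 100
```

Because both sides of a comparison pass through the same mapping, the canonical form does not need to be human-readable; it only needs to be identical for strings that differ by known confusions.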
### similarity(a, b)

```ts
type SimilarityResult = {
  score: number; // 0..1
  confusionRatio: number; // 0..1 (how many mismatches are OCR-like)
  errorDensity: number; // 0..1 (mismatches per length)
};

function similarity(a: string, b: string): SimilarityResult;
```

Computes a weighted edit-distance where confusable substitutions are penalized less than random substitutions.
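
A weighted edit distance of this kind can be sketched as a standard Levenshtein dynamic program with a custom substitution cost. The sketch below is an assumption-laden illustration, not the library's algorithm: the confusable pair table, the `0.3` cost, the length-based score formula, and the names `weightedDistance` and `sketchScore` are all hypothetical. It operates on raw strings so the effect of the reduced cost is visible.

```typescript
// Hypothetical confusable pairs (both directions) and their reduced cost.
const CONFUSABLE = new Set(["O0", "0O", "I1", "1I", "L1", "1L", "Z2", "2Z", "S5", "5S"]);
const CONFUSION_COST = 0.3; // assumed; a random substitution costs 1

function substitutionCost(x: string, y: string): number {
  if (x === y) return 0;
  return CONFUSABLE.has(x + y) ? CONFUSION_COST : 1;
}

// Levenshtein distance with weighted substitutions, computed row by row.
function weightedDistance(a: string, b: string): number {
  // prev[j] = distance between the first i-1 chars of a and the first j chars of b
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      curr[j] = Math.min(
        prev[j] + 1, // delete a[i-1]
        curr[j - 1] + 1, // insert b[j-1]
        prev[j - 1] + substitutionCost(a.charAt(i - 1), b.charAt(j - 1)),
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Assumed scoring: 1 minus the distance normalized by the longer string.
function sketchScore(a: string, b: string): number {
  const longest = Math.max(a.length, b.length);
  return longest === 0 ? 1 : 1 - weightedDistance(a, b) / longest;
}

console.log(sketchScore("10O", "100")); // OCR-like mismatch, small penalty
console.log(sketchScore("10X", "100")); // random mismatch, full penalty
```

With these assumed weights, one `O`/`0` swap in a three-character string costs 0.3 instead of 1, so the OCR-like string scores noticeably higher than the one with an unrelated character.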
## How it works (high level)
- Preprocess (optional): crop / grayscale / contrast
- OCR (optional): extract text via Tesseract
- Normalize: remove noise + convert common OCR confusions to canonical forms
- Compare: weighted distance + confusion-aware scoring
- Assert: fail only when similarity falls below the (possibly adaptive) threshold
## Troubleshooting
- OCR output is empty / garbage: crop tighter, increase contrast slightly, ensure the text is large enough.
- False positives: raise the `threshold`.
- False negatives from OCR confusions: lower the `threshold` a bit (or improve preprocessing).
## Roadmap (suggested)
- Alignment-aware confusion stats
- Reusable Tesseract worker for speed
- Custom confusion maps and normalization rules
- `contains` / `regex` style assertions
## License
MIT
