img-match
v0.0.3
Perceptual image matching using dHash — detect duplicates, placeholders, and near-identical images
Detect placeholder images in large datasets using perceptual hashing (dHash). Tolerant of resolution changes and compression artifacts.
Install
npm install img-match
Requires Node.js 18+ and Sharp. This package is ESM-only; there is no CommonJS entry point.
Quick Start
import { PlaceholderDetector } from "img-match";
const detector = new PlaceholderDetector();
// Register your known placeholder images
await detector.addPlaceholder("https://cdn.example.com/placeholder.png", "default");
await detector.addPlaceholder("https://cdn.example.com/coming-soon.png", "coming-soon");
// Check if an item image is a placeholder
const result = await detector.isPlaceholder("https://cdn.example.com/items/widget.png");
if (result.isPlaceholder) {
  console.log(`Matched placeholder: ${result.matchedPlaceholder}`);
  console.log(`Confidence: ${result.confidence}`);
}
Or use buffers directly (e.g., from a database or S3):
import { PlaceholderDetector } from "img-match";
import { readFile } from "node:fs/promises";
const detector = new PlaceholderDetector();
const placeholderBytes = await readFile("./placeholders/default.png");
await detector.addPlaceholder(placeholderBytes, "default");
const itemBytes = await readFile("./images/widget.png");
const result = await detector.isPlaceholder(itemBytes);
API
PlaceholderDetector
new PlaceholderDetector(options?)
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| hashSize | HashSize | HashSize.BIT_64 | Hash size preset (see Hash Size Presets) |
| threshold | number | Preset default | Max Hamming distance to consider a match (integer from 0 to preset bit length) |
| concurrency | number | 8 | Max concurrent image fetches in checkMany (positive integer) |
Invalid option values throw a RangeError.
detector.addPlaceholder(image, label)
Accepts a URL string or a Buffer of image bytes. Computes the image's hash and registers it with the given label. When a Buffer is passed, no HTTP request is made.
await detector.addPlaceholder("https://cdn.example.com/placeholder.png", "no-image");
detector.isPlaceholder(image)
Accepts a URL string or a Buffer. Checks the image against all registered placeholders. Returns a PlaceholderResult.
If no placeholders are registered, returns the standard non-match result without fetching the image.
Rejects if the image cannot be fetched (for URLs) or decoded.
const result = await detector.isPlaceholder("https://cdn.example.com/items/widget.png");
detector.checkMany(images)
Checks multiple images concurrently, respecting the configured concurrency limit. Each element can be a URL string or a Buffer. Returns an array of PlaceholderResult in the same input order.
If no placeholders are registered, returns one standard non-match result per input without fetching or hashing any images.
If an individual image fails to fetch (for URLs) or decode, checkMany does not reject the whole call. Instead, that entry's result contains isPlaceholder: false, confidence: 0, distance: <preset max>, and an error message.
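Because failed entries are reported in-band rather than by rejection, callers will usually want to separate them from genuine non-matches. The following is a minimal sketch of one way to do that; `partitionResults` is a hypothetical helper (not an export of img-match), and the mock objects only imitate the PlaceholderResult shape with made-up values.

```javascript
// Hypothetical helper: splits checkMany output into three buckets.
// Relies on results being returned in the same order as the inputs.
function partitionResults(inputs, results) {
  const placeholders = [];
  const real = [];
  const failed = [];
  results.forEach((result, i) => {
    const entry = { input: inputs[i], ...result };
    if (result.error !== undefined) failed.push(entry); // fetch or decode failure
    else if (result.isPlaceholder) placeholders.push(entry);
    else real.push(entry);
  });
  return { placeholders, real, failed };
}

// Mock data standing in for `await detector.checkMany(mockInputs)`.
// All field values here are invented for illustration.
const mockInputs = ["a.png", "b.png", "c.png"];
const mockResults = [
  { isPlaceholder: true, confidence: 1, matchedPlaceholder: "default", distance: 0 },
  { isPlaceholder: false, confidence: 0.5, matchedPlaceholder: null, distance: 32 },
  { isPlaceholder: false, confidence: 0, matchedPlaceholder: null, distance: 64, error: "fetch failed" },
];

const buckets = partitionResults(mockInputs, mockResults);
console.log(buckets.failed.map((entry) => entry.input)); // [ 'c.png' ]
```

Checking `error` first matters: a failed entry also has isPlaceholder: false, so testing only that flag would count failures as real images.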
const results = await detector.checkMany([
  "https://cdn.example.com/items/widget.png",
  "https://cdn.example.com/items/gadget.png",
]);
PlaceholderResult
{
  isPlaceholder: boolean; // true if distance <= threshold
  confidence: number; // 0 to 1 (1 = exact match)
  matchedPlaceholder: string | null; // label of the matched placeholder, or null when no placeholder is within threshold
  distance: number; // raw Hamming distance (0 to preset bit length)
  error?: string; // present when checkMany could not process that image
}
Hash Size Presets
The HashSize enum controls the hash bit length used for comparison. The project default is DEFAULT_HASH_SIZE (HashSize.BIT_64).
import { PlaceholderDetector, HashSize } from "img-match";
const detector = new PlaceholderDetector({ hashSize: HashSize.BIT_128 });

| Preset | Bit Length | Grid / Layout | Hex Length | Default Threshold | Purpose |
|--------|-----------|---------------|-----------|-------------------|---------|
| BIT_64 | 64 | 9×8 horizontal | 16 | 10 | Fast placeholder detection — best for most use cases |
| BIT_128 | 128 | Horizontal + vertical concat | 32 | 20 | Higher accuracy when images share similar horizontal patterns |
| BIT_256 | 256 | 17×16 horizontal | 64 | 40 | Maximum discrimination for large or detailed placeholder sets |
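The numbers in the table are related: each hex digit encodes 4 bits, and every default threshold is the same fraction of its bit length. The sketch below makes that explicit. `PRESET_TABLE`, `describePreset`, and `scaleThreshold` are illustrative names that copy values from the table; they are not how img-match exposes its presets.

```javascript
// Values copied from the preset table above (illustrative object only).
const PRESET_TABLE = {
  BIT_64: { bits: 64, defaultThreshold: 10 },
  BIT_128: { bits: 128, defaultThreshold: 20 },
  BIT_256: { bits: 256, defaultThreshold: 40 },
};

function describePreset(name) {
  const preset = PRESET_TABLE[name];
  if (!preset) throw new RangeError(`Unknown preset: ${name}`);
  return {
    hexLength: preset.bits / 4, // each hex digit encodes 4 bits
    thresholdRatio: preset.defaultThreshold / preset.bits,
  };
}

// Hypothetical helper: carry a custom threshold over to another preset
// by keeping the same fraction of differing bits.
function scaleThreshold(threshold, fromBits, toBits) {
  return Math.round((threshold * toBits) / fromBits);
}

console.log(describePreset("BIT_64")); // { hexLength: 16, thresholdRatio: 0.15625 }
console.log(scaleThreshold(6, 64, 256)); // 24
```

If you have tuned a custom threshold for one preset, scaling it proportionally is a reasonable starting point for another, though it should still be checked against your own data.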
Low-Level Utilities
These are exported for advanced use cases where you want to manage hashing and comparison yourself.
computeDHash(buffer, options?)
Computes a perceptual hash (dHash) from an image buffer.
import { computeDHash, HashSize } from "img-match";
const response = await fetch("https://cdn.example.com/image.png");
const buffer = Buffer.from(await response.arrayBuffer());
const hash64 = await computeDHash(buffer); // 16-char hex (default BIT_64)
const hash128 = await computeDHash(buffer, { hashSize: HashSize.BIT_128 }); // 32-char hex
const hash256 = await computeDHash(buffer, { hashSize: HashSize.BIT_256 }); // 64-char hex
hammingDistance(a, b)
Computes the Hamming distance between two hex hash strings of the same length (16, 32, or 64 characters).
Throws a TypeError if either hash is not a valid hexadecimal string of a supported length, or if the two hashes have different lengths.
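To make the behavior concrete, here is a minimal sketch of how such a function could be written. It is an illustrative re-implementation under the rules stated above, not the package's source; `hammingDistanceSketch` is a hypothetical name.

```javascript
// Hash lengths (in hex characters) that the README lists as supported.
const SUPPORTED_HEX_LENGTHS = new Set([16, 32, 64]);

function hammingDistanceSketch(a, b) {
  for (const hash of [a, b]) {
    if (
      typeof hash !== "string" ||
      !SUPPORTED_HEX_LENGTHS.has(hash.length) ||
      !/^[0-9a-f]+$/i.test(hash)
    ) {
      throw new TypeError(`Invalid hash: ${String(hash)}`);
    }
  }
  if (a.length !== b.length) {
    throw new TypeError("Hashes must have the same length");
  }

  let distance = 0;
  for (let i = 0; i < a.length; i++) {
    // XOR the two 4-bit digits, then count the set bits in the result.
    let diff = parseInt(a[i], 16) ^ parseInt(b[i], 16);
    while (diff !== 0) {
      distance += diff & 1;
      diff >>= 1;
    }
  }
  return distance;
}

console.log(hammingDistanceSketch("ffffffffffffffff", "fffffffffffffffe")); // 1
console.log(hammingDistanceSketch("0000000000000000", "ffffffffffffffff")); // 64
```

Note that the distance counts differing bits, not differing hex digits: a single changed digit can contribute anywhere from 1 to 4.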
import { hammingDistance } from "img-match";
const dist = hammingDistance("a3f1b2c4d5e6f789", "a3f1b2c4d5e6f780");
// dist = 2 (the final hex digits, 9 = 1001 and 0 = 0000, differ in two bits)
How It Works
The package uses the dHash (difference hash) algorithm:
- Resize the image to the preset grid size (e.g., 9×8 for BIT_64)
- Convert to grayscale
- Compare adjacent pixels (horizontal, vertical, or both depending on preset)
- Encode the result as a hex string
Two images are compared by counting the number of differing bits (Hamming distance). Identical images have distance 0. The default threshold varies by preset (e.g., 10 for BIT_64) and represents the maximum number of differing bits to consider a match.
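The comparison and encoding steps can be sketched in plain JavaScript. This example assumes the image has already been resized to 9×8 and converted to grayscale (the package uses Sharp for that), so it starts from a flat array of 72 brightness values. `dHashFromGrid` is a hypothetical helper, and the bit convention (1 when the left pixel is brighter) is an assumption; the package may use the opposite one.

```javascript
// Sketch of a horizontal dHash over an already-resized grayscale grid.
// `pixels` is a row-major array of width * height brightness values.
function dHashFromGrid(pixels, width = 9, height = 8) {
  if (pixels.length !== width * height) {
    throw new RangeError(`Expected ${width * height} pixels, got ${pixels.length}`);
  }
  let hex = "";
  let nibble = 0;
  let bitCount = 0;
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width - 1; x++) {
      const left = pixels[y * width + x];
      const right = pixels[y * width + x + 1];
      // One bit per adjacent pair: 1 when brightness drops from left to right.
      nibble = (nibble << 1) | (left > right ? 1 : 0);
      bitCount++;
      if (bitCount === 4) {
        hex += nibble.toString(16); // flush every 4 bits as one hex digit
        nibble = 0;
        bitCount = 0;
      }
    }
  }
  return hex;
}

// A gradient that darkens left to right: every comparison yields 1.
const darkening = Array.from({ length: 72 }, (_, i) => 255 - (i % 9) * 20);
console.log(dHashFromGrid(darkening)); // "ffffffffffffffff"
```

A 9-pixel-wide row gives 8 comparisons, so 8 rows produce 64 bits, which is why the BIT_64 hash is 16 hex characters long.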
Tuning the Threshold
Each preset has a default threshold that works well for most cases. If you need to adjust:
- Lower threshold = stricter matching, fewer false positives
- Higher threshold = looser matching, fewer false negatives
- Use the confidence and distance fields in the result to analyze your data and find the right value
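One way to find the right value is to collect distances for a hand-labelled sample and count the errors each candidate threshold would produce. The sketch below assumes you have recorded the `distance` field for images whose true status you know. `evaluateThreshold` is a hypothetical helper and the sample data is made up.

```javascript
// Counts misclassifications for one candidate threshold.
function evaluateThreshold(samples, threshold) {
  let falsePositives = 0;
  let falseNegatives = 0;
  for (const { distance, isReallyPlaceholder } of samples) {
    const predicted = distance <= threshold; // same rule as isPlaceholder
    if (predicted && !isReallyPlaceholder) falsePositives++;
    if (!predicted && isReallyPlaceholder) falseNegatives++;
  }
  return { threshold, falsePositives, falseNegatives };
}

// Made-up labelled distances, for illustration only.
const samples = [
  { distance: 0, isReallyPlaceholder: true },
  { distance: 6, isReallyPlaceholder: true },
  { distance: 12, isReallyPlaceholder: true },
  { distance: 9, isReallyPlaceholder: false },
  { distance: 30, isReallyPlaceholder: false },
];

for (const threshold of [5, 10, 15]) {
  console.log(evaluateThreshold(samples, threshold));
}
// { threshold: 5, falsePositives: 0, falseNegatives: 2 }
// { threshold: 10, falsePositives: 1, falseNegatives: 1 }
// { threshold: 15, falsePositives: 1, falseNegatives: 0 }
```

Which trade-off is acceptable depends on whether a missed placeholder or a wrongly flagged real image is more costly in your dataset.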
License
MIT
