@aspiresys/visor

v1.4.6

Desktop visual automation framework using OpenCV, OCR, and desktop interaction APIs.

Downloads

4,443

Visor

Desktop Visual Automation Framework for Node.js and TypeScript.

Visor is a visual desktop automation framework that combines:

  • OpenCV image matching
  • OCR text recognition
  • Mouse & keyboard automation
  • Desktop application automation

Visor is designed for automating desktop workflows using visual interactions instead of traditional DOM/browser automation.


Features

  • OpenCV-based image matching
  • Multi-scale image matching
  • OCR automation using Tesseract
  • OCR occurrence indexing (beta)
  • Region OCR support
  • Automatic display scaling detection
  • Mouse automation
  • Region-based mouse automation
  • Target offset support
  • Keyboard automation
  • Drag & drop support
  • Screenshot capture
  • Desktop application automation
  • OCR text searching
  • Wait APIs
  • Multi-image matching
  • Config-driven initialization
  • High-DPI display scaling support

What's New in 1.4.x

  • Automatic resolution-aware template matching
  • Template metadata support (.properties.json)
  • visor.version()
  • Region.capture()
  • Region.waitAnyImg()
  • Region.clickAny()
  • Improved DPI-aware matching
  • Faster image matching using predicted scaling

Installation

npm install @aspiresys/visor

Requirements

  • Windows
  • Node.js 18+
  • TypeScript

Visor Inspector

Visor includes an optional desktop Inspector tool for:

  • Capturing templates
  • Testing image matches
  • Measuring screen coordinates
  • Validating confidence thresholds

Run:

npx visor-inspector

Template Metadata

Visor Inspector automatically creates a .properties.json file alongside captured templates.

Example:

save.png
save.properties.json

Metadata includes:

  • Captured resolution
  • Display scaling factor
  • Capture environment

Visor uses this metadata to predict the correct image scale during automation, significantly improving matching speed and reliability across machines.
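
The scale prediction described above can be sketched as a ratio of display scaling factors. This is only an illustration: the metadata field names (`resolution`, `displayScale`) and the helper `predictTemplateScale` are hypothetical, since the actual .properties.json schema is not shown here.

```typescript
// Hypothetical shape of a template's .properties.json; the real schema may differ.
interface TemplateMetadata {
  resolution: { width: number; height: number }; // screen resolution at capture time
  displayScale: number; // display scaling factor at capture time, e.g. 1.25
}

// Predict how much a template must be resized to match the current display.
// Assumes UI elements grow linearly with the display scaling factor.
function predictTemplateScale(meta: TemplateMetadata, currentDisplayScale: number): number {
  if (meta.displayScale <= 0 || currentDisplayScale <= 0) {
    throw new RangeError("Scaling factors must be positive");
  }
  return currentDisplayScale / meta.displayScale;
}

const captured: TemplateMetadata = {
  resolution: { width: 1920, height: 1080 },
  displayScale: 1.25,
};
console.log(predictTemplateScale(captured, 1.5)); // 1.2
```

Starting the search at a predicted scale like this, rather than scanning every scale level, is one plausible reason the metadata speeds up matching.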


Quick Start

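import { visor } from '@aspiresys/visor';

async function main() {
    // Point Visor at the template directory and enable debug logging.
    visor.loadConfig({
        imagePath: './images',
        debug: true,
    });

    await visor.openApp('notepad');

    // Wait until notepad.png is matched on screen before interacting.
    await visor.wait('notepad.png');

    await visor.click('notepad.png');

    await visor.type('Hello from Visor');
}

// Report failures (for example, a wait that times out) instead of leaving the rejection unhandled.
main().catch((err) => {
    console.error(err);
    process.exitCode = 1;
});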

Configuration

visor.loadConfig({
    imagePath: './images',
    debug: true,
});

Configuration Options

| Option | Description |
| ------------ | ---------------------------------------- |
| scaleFactor | Optional manual display scaling override |
| imagePath | Default image directory |
| outputPath | Screenshot output directory |
| debug | Enable debug logging |


Display Scaling

Visor automatically detects Windows display scaling and adjusts mouse coordinates accordingly.

Common scaling values:

| Scaling | Value |
| ------- | ----- |
| 100% | 1.0 |
| 125% | 1.25 |
| 150% | 1.5 |
| 175% | 1.75 |
| 200% | 2.0 |

Manual override is still supported:

visor.loadConfig({
    scaleFactor: 1.5,
});
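
To illustrate what adjusting coordinates for display scaling can mean, here is a minimal sketch. It assumes that screenshots are measured in physical pixels and that the mouse API expects logical coordinates, so the conversion divides by the scale factor; Visor's internal convention may be the reverse. `physicalToLogical` is a hypothetical helper, not part of the Visor API.

```typescript
interface Point {
  x: number;
  y: number;
}

// Convert a point measured in physical screenshot pixels to logical mouse coordinates.
// Assumption: physical = logical * scaleFactor (for example 1.5 at 150% scaling).
function physicalToLogical(p: Point, scaleFactor: number): Point {
  if (scaleFactor <= 0) {
    throw new RangeError("scaleFactor must be positive");
  }
  return {
    x: Math.round(p.x / scaleFactor),
    y: Math.round(p.y / scaleFactor),
  };
}

console.log(physicalToLogical({ x: 300, y: 150 }, 1.5)); // { x: 200, y: 100 }
```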

Multi-Scale Image Matching

Visor automatically performs multi-scale template matching to support:

  • Different Windows scaling settings
  • Different screen resolutions
  • High-DPI displays
  • Cross-machine execution

By default Visor evaluates templates across multiple scale levels and automatically selects the best match.

Supported environments include:

  • 50% scaling
  • 75% scaling
  • 100% scaling
  • 125% scaling
  • 150% scaling
  • 175% scaling
  • 200% scaling

This significantly improves image matching reliability when automation is executed across different machines.
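
The "evaluate several scales, keep the best" idea can be sketched as follows. The `Scorer` callback stands in for real OpenCV template matching at a given scale; it, `bestScale`, and the default scale list are assumptions made for illustration rather than Visor internals.

```typescript
// Result of scoring a template at one scale; score is a similarity in [0, 1].
interface ScaledMatch {
  scale: number;
  score: number;
}

// Hypothetical scorer: returns the best template-match score at a given scale.
type Scorer = (scale: number) => number;

const DEFAULT_SCALES = [0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0];

// Try every scale and keep the highest-scoring one that clears the threshold.
function bestScale(
  score: Scorer,
  confidence = 0.8,
  scales: number[] = DEFAULT_SCALES,
): ScaledMatch | null {
  let best: ScaledMatch | null = null;
  for (const scale of scales) {
    const s = score(scale);
    if (s >= confidence && (best === null || s > best.score)) {
      best = { scale, score: s };
    }
  }
  return best;
}

// Toy scorer that peaks at 1.25, standing in for real image matching.
const toyScorer: Scorer = (scale) => 1 - Math.abs(scale - 1.25);
console.log(bestScale(toyScorer)); // { scale: 1.25, score: 1 }
```

Returning `null` when no scale clears the threshold keeps "not found" distinct from a weak match.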


Visual Automation APIs

Click Image

await visor.click('save.png');

Find Image

const region = await visor.find('icon.png');

Region-Based Automation

Regions can be obtained from:

  • visor.find()
  • visor.findAll()
  • visor.findText()
  • Visor Inspector

Move To Region

const region = await visor.find('save.png');

await visor.moveToRegion(region);

Click Region

const region = await visor.find('save.png');

await visor.clickRegion(region);

Double Click Region

await visor.doubleClickRegion(new Region(100, 200, 150, 50));

Right Click Region

await visor.rightClickRegion(new Region(100, 200, 150, 50));

Display scaling is automatically applied when using region-based APIs.


Region Object API

Regions returned by Visor are first-class objects that provide built-in automation methods.

Regions can be obtained from:

  • visor.find()
  • visor.findAll()
  • visor.findText()
  • Visor Inspector

Example:

const dialog = await visor.find('dialog.png');

const save = await dialog.find('save.png');

await save.click();

Region.find()

Search for an image within the current region.

const dialog = await visor.find('dialog.png');

const save = await dialog.find('save.png');

Region.findAll()

Find all image matches within the current region.

const dialog = await visor.find('dialog.png');

const buttons = await dialog.findAll('button.png');

Region.exists()

Check whether an image exists within the current region.

const dialog = await visor.find('dialog.png');

const exists = await dialog.exists('save.png');

Region.findText()

Search for text within the current region.

const dialog = await visor.find('dialog.png');

const submit = await dialog.findText('Submit');

Region.existsText()

Check whether text exists within the current region.

const dialog = await visor.find('dialog.png');

const exists = await dialog.existsText('Success');

Region.readText()

Extract OCR text from the current region.

const dialog = await visor.find('dialog.png');

const result = await dialog.readText();

console.log(result.text);

Region.click()

const save = await visor.find('save.png');

await save.click();

Region.doubleClick()

await save.doubleClick();

Region.rightClick()

await save.rightClick();

Region.move()

await save.move();

Check Image Exists

const exists = await visor.exists('login.png');

Wait For Image

await visor.wait('save.png');

await visor.wait('save.png', {
    confidence: 0.9,
    timeout: 10000,
});
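
A wait with a timeout is typically a polling loop. The sketch below shows the general pattern with a generic `probe` callback in place of an image search; `waitFor`, `probe`, and `intervalMs` are hypothetical names, and Visor's own wait implementation may work differently.

```typescript
// Poll `probe` until it returns a non-null result or the timeout elapses.
async function waitFor<T>(
  probe: () => Promise<T | null>,
  timeoutMs = 10_000,
  intervalMs = 250,
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const result = await probe();
    if (result !== null) {
      return result;
    }
    if (Date.now() >= deadline) {
      throw new Error(`Timed out after ${timeoutMs} ms`);
    }
    // Sleep briefly before the next attempt to avoid a busy loop.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Toy probe that succeeds on the third attempt.
let attempts = 0;
waitFor(async () => (++attempts >= 3 ? "found" : null), 1000, 10)
  .then((value) => console.log(value, attempts)); // found 3
```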

Wait For Multiple Images

await visor.waitAny(['light-theme.png', 'dark-theme.png']);

Click Multiple Theme Variants

await visor.clickAny(['send-light.png', 'send-dark.png']);

Drag & Drop

await visor.dragDrop('source.png', 'target.png');

Hover

await visor.hover('menu.png');

Target Offsets

Target offsets allow mouse actions to be performed relative to the center of a matched image.

Useful for:

  • Dropdown arrows
  • Adjacent controls
  • Dynamic layouts
  • Composite UI elements

Click With Offset

await visor.click('search.png', 0.8, {
    x: 50,
    y: 0,
});

Hover With Offset

await visor.hover('menu.png', 0.8, {
    x: -20,
    y: 10,
});

Offsets are applied relative to the center of the matched region before display scaling adjustments are performed.
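
The order of operations described above (center, then offset, then scaling) can be sketched as plain arithmetic. The `Rect` shape assumes a region is described by x, y, width, and height, and the division by `scaleFactor` assumes a physical-to-logical conversion; both are illustrative assumptions, and `clickPoint` is a hypothetical helper.

```typescript
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface Point {
  x: number;
  y: number;
}

// Compute the point to act on: region center, plus offset, then display scaling.
function clickPoint(region: Rect, offset: Point = { x: 0, y: 0 }, scaleFactor = 1): Point {
  // 1. Center of the matched region, shifted by the offset.
  const cx = region.x + region.width / 2 + offset.x;
  const cy = region.y + region.height / 2 + offset.y;
  // 2. Display scaling is applied last (assumed here to be a division).
  return {
    x: Math.round(cx / scaleFactor),
    y: Math.round(cy / scaleFactor),
  };
}

const matched: Rect = { x: 100, y: 200, width: 150, height: 50 };
console.log(clickPoint(matched, { x: 50, y: 0 })); // { x: 225, y: 225 }
console.log(clickPoint(matched, { x: 50, y: 0 }, 1.5)); // { x: 150, y: 150 }
```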


OCR Automation

Visor includes OCR automation powered by Tesseract.js.

OCR supports:

  • Full-screen OCR
  • Region OCR
  • Text search
  • Text clicking
  • Text waiting
  • OCR occurrence indexing

Read Screen

const result = await visor.readScreen();

console.log(result.text);

Read Region

const result = await visor.readRegion(new Region(100, 100, 500, 300));

console.log(result.text);

Find Text

const region = await visor.findText('Submit');

Click Text

await visor.clickText('Login');

Wait For Text

await visor.waitText('Success');

OCR Occurrence Indexing

When the same text appears multiple times on screen, Visor allows selecting a specific occurrence.

await visor.clickText('Inbox', 0);
await visor.clickText('Inbox', 1);
await visor.clickText('Inbox', 2);

OCR elements are processed from:

Top → Bottom
Left → Right

This improves automation stability when multiple matching text elements exist on screen.
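
A top-to-bottom, left-to-right ordering like the one described can be sketched as below. The `TextBox` shape, the `rowTolerance` parameter, and the row-grouping strategy are assumptions for illustration; the actual ordering rules inside Visor are not documented here.

```typescript
interface TextBox {
  text: string;
  x: number;
  y: number;
  width: number;
  height: number;
}

// Order matching OCR boxes top-to-bottom, then left-to-right, and return the n-th one.
// Boxes whose tops lie within `rowTolerance` pixels of a row's first box share that row.
function nthOccurrence(
  boxes: TextBox[],
  text: string,
  index = 0,
  rowTolerance = 10,
): TextBox | undefined {
  const matches = boxes.filter((b) => b.text === text).sort((a, b) => a.y - b.y);
  const ordered: TextBox[] = [];
  let row: TextBox[] = [];
  let rowTop = 0;
  const flushRow = (): void => {
    ordered.push(...row.sort((a, b) => a.x - b.x));
    row = [];
  };
  for (const box of matches) {
    if (row.length > 0 && box.y - rowTop >= rowTolerance) {
      flushRow(); // this box starts a new row
    }
    if (row.length === 0) {
      rowTop = box.y;
    }
    row.push(box);
  }
  flushRow();
  return ordered[index];
}

const words: TextBox[] = [
  { text: "Inbox", x: 300, y: 52, width: 40, height: 12 },
  { text: "Inbox", x: 40, y: 50, width: 40, height: 12 },
  { text: "Inbox", x: 40, y: 120, width: 40, height: 12 },
];
console.log(nthOccurrence(words, "Inbox", 1)?.x); // 300
```

Grouping into rows first avoids the situation where two words on the same visual line are ordered by a one- or two-pixel difference in their vertical position.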


OCR Optimizations

Visor includes:

  • Shared OCR worker reuse
  • OCR preprocessing
  • Grayscale normalization
  • Image sharpening
  • Confidence filtering
  • OCR occurrence indexing

Benefits:

  • Faster OCR execution
  • Improved OCR accuracy
  • Lower memory usage
  • Improved framework stability
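
As an illustration of the grayscale normalization step listed above, the following sketch converts RGBA pixels to grayscale and stretches the contrast to the full 0 to 255 range. It uses the ITU-R BT.601 luma weights and a simple min-max stretch; these choices and the function name `toNormalizedGrayscale` are assumptions, and Visor's actual preprocessing may use different steps or parameters.

```typescript
// Convert RGBA pixels (4 bytes each) to one grayscale byte per pixel,
// then stretch the values so the darkest pixel is 0 and the brightest is 255.
function toNormalizedGrayscale(rgba: Uint8Array): Uint8Array {
  const count = Math.floor(rgba.length / 4);
  const gray = new Float64Array(count);
  let min = Infinity;
  let max = -Infinity;
  for (let i = 0; i < count; i++) {
    // BT.601 luma weights; the alpha channel is ignored.
    const luma = 0.299 * rgba[i * 4] + 0.587 * rgba[i * 4 + 1] + 0.114 * rgba[i * 4 + 2];
    gray[i] = luma;
    min = Math.min(min, luma);
    max = Math.max(max, luma);
  }
  const range = max - min;
  const out = new Uint8Array(count);
  for (let i = 0; i < count; i++) {
    // A uniform image has no contrast to stretch, so keep its value as-is.
    out[i] = range === 0 ? Math.round(gray[i]) : Math.round(((gray[i] - min) / range) * 255);
  }
  return out;
}

// Three pixels: black, mid gray, white.
const pixels = new Uint8Array([0, 0, 0, 255, 128, 128, 128, 255, 255, 255, 255, 255]);
console.log(Array.from(toNormalizedGrayscale(pixels))); // [ 0, 128, 255 ]
```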

Mouse Automation

Move Mouse

await visor.moveMouse(500, 300);

Move To Inspector Region

await visor.moveToRegion(new Region(90, 61, 138, 69));

Region coordinates can be copied directly from Visor Inspector match results.


Scroll Down

await visor.scrollDown(1000);

Scroll Up

await visor.scrollUp(1000);

Mouse Position

const pos = await visor.getMousePosition();

Keyboard Automation

Type Text

await visor.type('Hello World');

Press Keys

await visor.press(visor.Key.LeftControl, visor.Key.S);

Screenshot Automation

await visor.captureScreenshot('./screenshots/home.png');

Desktop Application Automation

Open Application

await visor.openApp('notepad');

Close Application

await visor.closeApp('notepad.exe');

Confidence Thresholds

Supported range:

0.0 - 1.0

Recommended values:

| Confidence | Usage |
| ---------- | --------------- |
| 0.7 | Dynamic UI |
| 0.8 | General usage |
| 0.9 | Strict matching |
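
To show how the threshold changes which candidates survive, here is a small sketch. The `Match` shape and `filterByConfidence` are hypothetical and only model the idea that a match is accepted when its similarity score is at least the configured confidence.

```typescript
interface Match {
  x: number;
  y: number;
  score: number; // similarity in the range 0.0 - 1.0
}

// Keep only candidates whose score meets the confidence threshold, best first.
function filterByConfidence(candidates: Match[], confidence = 0.8): Match[] {
  if (confidence < 0 || confidence > 1) {
    throw new RangeError("confidence must be within 0.0 - 1.0");
  }
  return candidates
    .filter((m) => m.score >= confidence)
    .sort((a, b) => b.score - a.score);
}

const found: Match[] = [
  { x: 10, y: 10, score: 0.72 },
  { x: 200, y: 40, score: 0.86 },
  { x: 400, y: 90, score: 0.93 },
];
console.log(filterByConfidence(found, 0.7).length); // 3
console.log(filterByConfidence(found, 0.8).length); // 2
console.log(filterByConfidence(found, 0.9).length); // 1
```

A stricter threshold reduces false positives but can reject a genuine match whose score dips because of theme or rendering differences.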


Performance Improvements

Visor includes:

  • Shared OCR worker reuse
  • Multi-scale image matching
  • OCR preprocessing pipeline
  • Automatic display scaling detection

These improvements increase reliability across varying display configurations and reduce OCR initialization overhead.
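
Reusing one shared worker usually comes down to a lazily initialized singleton. The sketch below shows that pattern in generic form; `shared` and the fake worker factory are hypothetical, and no real OCR library is called.

```typescript
// Lazily create one shared resource and hand the same promise to every caller.
// `factory` stands in for expensive setup such as starting an OCR worker.
function shared<T>(factory: () => Promise<T>): () => Promise<T> {
  let instance: Promise<T> | undefined;
  return () => {
    if (instance === undefined) {
      instance = factory(); // runs at most once
    }
    return instance;
  };
}

let created = 0;
const getWorker = shared(async () => {
  created += 1; // pretend this is slow initialization
  return { id: created };
});

const firstCall = getWorker();
const secondCall = getWorker();
console.log(firstCall === secondCall, created); // true 1
```

Caching the promise rather than the resolved value means concurrent callers that arrive during initialization also share the same worker.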


Troubleshooting

Image Not Found

Possible causes:

  • Incorrect image path
  • Confidence threshold set too high for the actual match quality
  • Theme mismatch
  • Poor template quality

OCR Not Detecting Text

Possible causes:

  • Small fonts
  • Low contrast text
  • Blurry UI elements

Mouse Clicking Incorrect Position

Visor automatically detects Windows display scaling.

If required, manually override:

visor.loadConfig({
    scaleFactor: 1.5,
});

Roadmap

  • Match visualization overlay
  • Inspector coordinate picker
  • Multi-monitor support improvements
  • Parallel image matching
  • Advanced OCR tuning
  • Electron recorder
  • AI-assisted automation

Tech Stack

  • OpenCV
  • Tesseract.js
  • screenshot-desktop
  • sharp
  • nut.js

Why Visor?

Unlike Selenium or Playwright, Visor automates desktop applications using image recognition and OCR.

Works with:

  • Native Windows applications
  • Citrix environments
  • Remote desktops
  • Thick-client applications
  • Legacy systems