npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2026 – Pkg Stats / Ryan Hefner

osctrl

v0.1.0-alpha

Published

Desktop Automation CLI for AI Coding Agents

Readme

OSCTRL - Desktop Automation CLI

Lightweight CLI for desktop automation, designed for AI coding agents.

Table of Contents

Overview

OSCTRL provides programmatic control of desktop operations through simple shell commands with JSON output. Designed to be context-efficient for AI agents - no SDK integration required, just shell commands with minimal token overhead.

Built on @nut-tree-fork/nut-js, it enables mouse control, keyboard input, window management, screen capture, clipboard operations, and process management.

Platform Support

| Platform | Status | Notes | | ------------- | ---------- | ---------------------------- | | Windows 10/11 | ✅ Full | Primary development platform | | macOS | ✅ Full | Intel and Apple Silicon | | Linux | 🔮 Planned | Future release |

Installation

Prerequisites

  • Operating System: Windows 10/11 or macOS (Intel/Apple Silicon)
  • Node.js: >= 18.0.0 (LTS recommended)
  • Permissions: Standard user (no admin/sudo required)

Install Dependencies

npm install

Global Installation (Optional)

npm link

After linking, you can use osctrl directly instead of node osctrl.mjs.

Quick Start

# Get desktop context (ALWAYS do this first for mouse operations)
node osctrl.mjs context

# Move mouse cursor
node osctrl.mjs mouse move 500 300

# Type text
node osctrl.mjs keyboard type "Hello World"

# Take screenshot
node osctrl.mjs screen capture --output screenshot.png

# Get active window
node osctrl.mjs window active

# Read clipboard
node osctrl.mjs clipboard read

# List processes
node osctrl.mjs process list

Command Reference

# Get complete desktop context (screen + mouse + active window)
osctrl context

# Get screen dimensions only
osctrl context-screen

# Get mouse position only
osctrl context-mouse

# Get active window info only
osctrl context-window

Example output for osctrl context:

{
  "success": true,
  "command": "context",
  "data": {
    "screen": { "width": 2560, "height": 1440 },
    "mouse": { "x": 500, "y": 300 },
    "activeWindow": {
      "title": "Visual Studio Code",
      "position": { "x": 100, "y": 50 },
      "size": { "width": 1200, "height": 800 }
    }
  }
}
# Run health check
osctrl health

Output Example:

{
  "success": true,
  "command": "health",
  "data": {
    "platform": "darwin",
    "platformName": "macOS",
    "nodeVersion": "v20.10.0",
    "osctrlVersion": "0.1.0-alpha",
    "checks": {
      "platform": true,
      "mouse": true,
      "screen": true,
      "node": true
    },
    "status": "healthy"
  }
}
# Move cursor to absolute position
osctrl mouse move <x> <y>
osctrl mouse move <x> <y> --from-resolution <width>x<height>  # Scale coordinates

# Move and click in one command
osctrl mouse click-at <x> <y> [--button left|right|middle]

# Click at current position
osctrl mouse click [--button left|right|middle]

# Double-click
osctrl mouse doubleclick [--button left|right|middle]

# Scroll wheel (positive=down, negative=up)
osctrl mouse scroll <amount>

# Drag from current position to target
osctrl mouse drag <x> <y> [--button left|right|middle]

# Get current cursor position
osctrl mouse position

Examples:

osctrl mouse move 1920 1080
osctrl mouse move 960 540 --from-resolution 1920x1080  # Scale from different resolution
osctrl mouse click-at 500 300                          # Move and click
osctrl mouse click --button right
osctrl mouse scroll 5
osctrl mouse drag 100 200
osctrl mouse position
# Type text string
osctrl keyboard type <text>

# Press and release single key
osctrl keyboard press <key>

# Execute key combination (hotkey)
osctrl keyboard hotkey <keys...>

# Press a key multiple times
osctrl keyboard repeat <key> <count> [--delay <ms>]

# Hold key down
osctrl keyboard hold <key>

# Release held key
osctrl keyboard release <key>

# Emergency reset - release all modifier keys
osctrl keyboard releaseall

Examples:

osctrl keyboard type "Hello World"
osctrl keyboard press enter
osctrl keyboard hotkey ctrl c           # Copy
osctrl keyboard hotkey ctrl v           # Paste
osctrl keyboard hotkey win r            # Run dialog (Windows)
osctrl keyboard hotkey ctrl shift esc   # Task Manager
osctrl keyboard repeat down 10          # Press down arrow 10 times
osctrl keyboard repeat tab 5 --delay 100  # Tab 5 times with 100ms delay
osctrl keyboard releaseall              # Fix stuck keys

Supported Key Names:

  • Modifiers: ctrl, alt, shift, win (aliases: control, super, cmd, meta)
  • Function: f1 through f24
  • Navigation: up, down, left, right, home, end, pageup, pagedown
  • Editing: enter, tab, space, backspace, delete, insert, esc
  • Letters: a through z (case-insensitive)
  • Numbers: 0 through 9
  • See lib/keys.mjs for complete key mappings
# Get active window information
osctrl window active

# List all windows with titles
osctrl window list

# Focus window by title
osctrl window focus <title> [--exact]

# Wait for window to appear
osctrl window wait <title> [--timeout <ms>] [--exact]

# Get window position and size
osctrl window bounds <title> [--exact]

# Check if window exists (error if not found)
osctrl window exists <title> [--exact]

# Get window state (minimized, maximized, normal)
osctrl window state <title> [--exact]

# Move window to position
osctrl window move <title> <x> <y> [--exact]

# Resize window
osctrl window resize <title> <width> <height> [--exact]

# Minimize window
osctrl window minimize <title> [--exact]

# Maximize window
osctrl window maximize <title> [--exact]

Window Matching:

  • Default: Partial match (case-insensitive) - "Notepad" matches "Untitled - Notepad"
  • With --exact: Exact match (case-sensitive)

Examples:

osctrl window active
osctrl window list
osctrl window focus "Notepad"
osctrl window wait "Save As" --timeout 5000   # Wait up to 5 seconds
osctrl window bounds "Chrome"
osctrl window exists "Calculator"
osctrl window state "Visual Studio Code"
osctrl window resize "Chrome" 1024 768
osctrl window maximize "Visual Studio Code" --exact
# Capture screenshot to file
osctrl screen capture [--output path] [--format png|jpg]

# Capture as base64 (for embedding)
osctrl screen capture --base64 [--format png|jpg]

# Capture specific region (x,y,width,height)
osctrl screen capture --region <x,y,w,h> [--output path]

# Get screen dimensions
osctrl screen size

# List all monitors
osctrl screen monitors

# Get pixel color at position (returns hex #RRGGBB)
osctrl screen pixel <x> <y>

# Wait until pixel matches color
osctrl screen wait-color <x> <y> <hex> [--timeout <ms>] [--tolerance <n>]

# Assert pixel matches color (fails if not)
osctrl screen assert-color <x> <y> <hex> [--tolerance <n>]

# Find first pixel matching color in region
osctrl screen find-color <hex> [--region <x,y,w,h>] [--tolerance <n>]

# Wait for screen content to change
osctrl screen wait-change [--timeout <ms>] [--region <x,y,w,h>]

# Compare current screen to baseline
osctrl screen diff [--baseline <path>] [--threshold <n>]

Examples:

osctrl screen capture --output screenshot.png
osctrl screen capture --base64 --format jpg
osctrl screen capture --region 100,100,800,600 --output region.png
osctrl screen size
osctrl screen monitors
osctrl screen pixel 500 300
osctrl screen wait-color 100 100 "#FF0000" --timeout 5000
osctrl screen assert-color 100 100 "#00FF00" --tolerance 10
osctrl screen find-color "#0000FF" --region 0,0,500,500
osctrl screen wait-change --timeout 3000
# Read clipboard text
osctrl clipboard read

# Write text to clipboard
osctrl clipboard write <text>

Examples:

osctrl clipboard read
osctrl clipboard write "Hello from CLI"
# Launch application (detached process)
osctrl process launch <command> [args...]

# List running processes
osctrl process list [--filter name]

# Kill process by PID
osctrl process kill <pid> [--force]

Examples:

osctrl process launch notepad.exe
osctrl process launch cmd.exe /k echo "Hello"
osctrl process list
osctrl process list --filter chrome
osctrl process kill 12345
osctrl process kill 12345 --force
# Pause execution for specified milliseconds
osctrl wait <ms>

# Execute multiple commands from JSON file
osctrl batch run <file> [--timeout <ms>]

Examples:

osctrl wait 1000                    # Wait 1 second
osctrl wait 500                     # Wait 500ms

osctrl batch run commands.json      # Run commands from file
osctrl batch run commands.json --timeout 30000  # With 30s timeout

Batch file format:

{
  "commands": [
    ["mouse", "move", "500", "300"],
    ["wait", "100"],
    ["mouse", "click"],
    ["keyboard", "type", "Hello"]
  ]
}

JSON Output Format

All commands return JSON for easy parsing by AI agents and scripts.

Success Response

{
  "success": true,
  "timestamp": "2026-01-13T12:00:00.000Z",
  "command": "mouse move",
  "data": {
    "x": 500,
    "y": 300
  }
}

Error Response

{
  "success": false,
  "timestamp": "2026-01-13T12:00:00.000Z",
  "command": "window focus",
  "error": "Window not found: Notepad",
  "errorCode": "WINDOW_NOT_FOUND"
}

Error Codes

  • WINDOW_NOT_FOUND - Specified window title not found
  • INVALID_COORDINATES - Coordinates out of range or invalid format
  • INVALID_KEY - Unrecognized key name
  • FILE_WRITE_ERROR - Cannot write to specified output path
  • TIMEOUT - Operation exceeded time limit
  • PROCESS_LAUNCH_FAILED - Failed to launch process
  • CLIPBOARD_ERROR - Clipboard operation failed
  • UNKNOWN_ERROR - Unexpected error occurred

System Requirements

Supported Platforms

  • Windows 10 (64-bit)
  • Windows 11 (64-bit)
  • macOS (Intel and Apple Silicon)

Node.js Versions

Tested with Node.js 18.x and 20.x LTS releases.

Known Limitations

Multi-Monitor Support

Status: Partial support

  • osctrl screen monitors lists available displays
  • Screen commands operate on primary monitor by default
  • Region capture with absolute coordinates may work across monitors

Window Operations

  • Minimize/Maximize: Uses platform-specific keyboard shortcuts
    • Windows: Win+Down / Win+Up
    • macOS: Cmd+M / Cmd+Ctrl+F (fullscreen)
  • Window List: Accesses internal provider registry to get window titles

Platform-Specific Implementations

  • Process listing: PowerShell (Windows) / ps aux (macOS)
  • Window shortcuts: Platform-aware keyboard combinations

Troubleshooting

Installation Issues

Problem: npm install fails with nut.js errors

Solutions:

  1. Verify Node.js version: node --version (must be >= 18.0.0)
  2. Try npm install --force
  3. Check for Visual C++ Redistributables
  4. See nut.js fork documentation

Runtime Issues

Problem: "Command not found" errors

Solution: Run with node osctrl.mjs prefix or use npm link for global installation

Problem: Mouse/keyboard commands not working

Solution: Ensure no other automation tools are interfering. Close conflicting applications.

Problem: Window commands can't find windows

Solution:

  • Use osctrl window list to see exact window titles
  • Try partial match without --exact flag
  • Some system windows may be inaccessible

AI Agent Integration

Why a CLI?

AI coding agents operate under context constraints. OSCTRL is designed for efficiency:

  • ~15 tokens per command (osctrl mouse click)
  • ~50 tokens per response (compact JSON)
  • No schema overhead - uses existing Bash/shell tools
  • Stateless - no connection management

AI-Specific Instructions

Ready-to-use instruction files for different AI assistants:

| AI Assistant | Instructions File | | ------------ | ---------------------------------------------------------------------------- | | Claude Code | docs/ai-instructions/CLAUDE.md | | Cursor | docs/ai-instructions/cursorrules.md | | OpenAI Codex | docs/ai-instructions/codex.md | | Gemini CLI | docs/ai-instructions/gemini.md |

Copy the relevant instructions into your AI assistant's configuration file.

Usage from AI Coding Agents

// Execute command via shell
const { execSync } = require("child_process");

const result = execSync("node osctrl.mjs mouse move 500 300", {
  encoding: "utf8",
});

const response = JSON.parse(result);
if (response.success) {
  console.log("Mouse moved to:", response.data);
} else {
  console.error("Error:", response.error);
}

Recommended Practices

  1. Always get context first - Run osctrl context before mouse operations
  2. Parse JSON output - All responses are valid JSON
  3. Check success field - Don't assume operations succeeded
  4. Handle error codes - Use errorCode for programmatic error handling
  5. Validate coordinates - Ensure they're within screen bounds
  6. Use exact match - Add --exact for window commands when title is known

Development

Running Tests

npm test

Code Formatting

npm run format

Documentation

License

Apache-2.0 - See LICENSE for details.

Credits

Built with:


Version: 0.1.0-alpha Platforms: Windows, macOS Target Users: AI Coding Agents, Developers, Test Automation