npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2026 – Pkg Stats / Ryan Hefner

autoui-mcp

v0.1.4

Published

Desktop automation MCP Server – Rust port of pyautogui-mcp

Readme

autoui-mcp

Desktop automation MCP Server – Rust port of pyautogui-mcp.

Provides cross-platform UI automation capabilities through the Model Context Protocol (MCP).

Features

  • 🖱️ Mouse control (move, click, drag)
  • ⌨️ Keyboard input and shortcuts
  • 📸 Screenshot capture
  • 🧠 Intelligent action planning with candidate suggestions (using Qwen3-VL)
  • ✅ Vision-based result verification
  • 📋 Clipboard operations
  • 🪟 Window management

Installation

Global Installation

npm install -g autoui-mcp
# or
pnpm add -g autoui-mcp

Using with pnpm dlx (Recommended)

No installation needed! Use directly:

pnpm dlx autoui-mcp

Usage

With OpenCode

Add to your opencode.json:

{
  "mcp": {
    "autoui": {
      "type": "local",
      "enabled": true,
      "command": ["pnpm", "dlx", "autoui-mcp@latest"],
      "environment": {
        "QWEN_API_KEY": "your-api-key",
        "QWEN_BASE_URL": "https://dashscope.aliyuncs.com/compatible-mode/v1",
        "QWEN_MODEL": "qwen3-vl-flash"
      }
    }
  }
}

Standalone

# Run directly
autoui-mcp

# Or with pnpm dlx
pnpm dlx autoui-mcp

Environment Variables

  • QWEN_API_KEY: API key for Qwen vision model (required for vision-based features)
  • QWEN_BASE_URL: Base URL for Qwen API (optional, defaults to Alibaba Cloud)
  • QWEN_MODEL: Model to use (optional, defaults to qwen3-vl-flash)

Requirements

  • Rust toolchain (for building from source)
  • Node.js >= 18 (for npm package)

Available Tools

Vision Tools

auto_vision_plan

Intelligent action planning with candidates - Analyzes screen and intent, returns both AI-recommended action and all candidate UI elements.

Parameters:

  • intent (string, required): Task intent (e.g., "login to system", "click close button")
  • context (array of strings, optional): Previously executed operations

Returns:

{
  "intent": "login to system",
  "screen_size": {"width": 1920, "height": 1080},
  "action": {
    "action_type": "type",
    "target": {
      "label": "username field",
      "type": "input",
      "center": {"x": 500, "y": 300},
      "bbox": {"x": 400, "y": 280, "width": 200, "height": 40}
    },
    "params": {"text": "[email protected]"},
    "reasoning": "Found username input, should enter email first",
    "confidence": 0.95
  },
  "candidates": [
    {"label": "username field", "center": {"x": 500, "y": 300}, ...},
    {"label": "password field", "center": {"x": 500, "y": 350}, ...},
    {"label": "login button", "center": {"x": 500, "y": 400}, ...}
  ],
  "total_candidates": 3
}

Action types:

  • click: Click element (requires target + params.button/clicks)
  • type: Type text (requires target + params.text)
  • press: Press key (requires params.keys, e.g., 'enter', 'command+s')
  • scroll: Scroll (requires target or screen center + params.scroll_amount)
  • drag: Drag (currently not supported, suggest decomposing to click + move)
  • wait: Wait for loading (requires params.wait_reason)
  • done: Task completed

Usage:

  • For complex multi-step tasks: Follow AI recommendation (use action.action_type, action.target, action.params)
  • For simple single-step tasks: Choose from candidates and decide action yourself

auto_vision_verify

Result verification - Takes a screenshot and judges whether the current screen state matches the expected assertion.

Parameters:

  • assertion (string, required): Expected screen state description (e.g., "file successfully opened")
  • action_performed (string, optional): Description of just-performed action for context

Returns:

{
  "assertion": "window closed",
  "passed": true,
  "confidence": 0.95,
  "reason": "The window is no longer visible on screen"
}

Mouse Tools

  • auto_mouse_click: Click at coordinates (supports left/right/middle button, single/double-click)
  • auto_mouse_move: Move mouse to coordinates (hover)
  • auto_mouse_scroll: Scroll mouse wheel (positive=up, negative=down)
  • auto_mouse_drag: Drag from start to end coordinates

Keyboard Tools

  • auto_keyboard_type: Type text (all text is pasted via clipboard for reliability)
  • auto_keyboard_press: Press key or key combination (e.g., 'enter', 'ctrl+c', 'command+v')

Workflow: plan → act → verify

Step 1: Plan

Call auto_vision_plan to get AI recommendation and candidates:

result = auto_vision_plan(
    intent="login to system",
    context=[]  # or ["entered username", "clicked next"]
)

Step 2: Act

Option A: Follow AI recommendation (for complex tasks)

action = result.action
if action.action_type == "click":
    auto_mouse_click(
        x=action.target.center.x,
        y=action.target.center.y,
        clicks=action.params.clicks or 1
    )
elif action.action_type == "type":
    auto_mouse_click(x=action.target.center.x, y=action.target.center.y)
    auto_keyboard_type(text=action.params.text)
elif action.action_type == "press":
    auto_keyboard_press(keys=action.params.keys)
elif action.action_type == "scroll":
    if action.target:
        auto_mouse_click(x=action.target.center.x, y=action.target.center.y)
    auto_mouse_scroll(clicks=action.params.scroll_amount)

Option B: Choose from candidates (for simple tasks)

# Pick the first (most relevant) candidate
target = result.candidates[0]
auto_mouse_click(x=target.center.x, y=target.center.y)

Step 3: Verify

verify_result = auto_vision_verify(assertion="logged in successfully")
if not verify_result.passed:
    # Retry or handle error

Complete Examples

Example 1: Complex task (follow AI recommendation)

# Task: Login to system
context = []

# Plan first action
result = auto_vision_plan(intent="login to system", context=context)
# Execute recommendation
auto_mouse_click(x=result.action.target.center.x, y=result.action.target.center.y)
auto_keyboard_type(text=result.action.params.text)
context.append("Entered username")

# Plan next action
result = auto_vision_plan(intent="continue login", context=context)
# Execute
auto_mouse_click(x=result.action.target.center.x, y=result.action.target.center.y)
auto_keyboard_type(text="password")
context.append("Entered password")

# Continue until action_type == "done"

Example 2: Simple task (choose from candidates)

# Task: Click close button
result = auto_vision_plan(intent="close button", context=[])

# Choose from candidates
target = result.candidates[0]
auto_mouse_click(x=target.center.x, y=target.center.y)

# Verify
verify_result = auto_vision_verify(assertion="window closed")

License

MIT