app-screen-mcp

v1.0.0

MCP server for iOS Simulator automation — accessibility tree, screenshot, tap/swipe/type, and AI screen perception tools

Why app-screen-mcp

Most mobile AI automation fails for one reason: it acts blind.

app-screen-mcp solves that by combining:

  • Structured accessibility data (idb ui describe-all)
  • Real simulator screenshots (xcrun simctl io ... screenshot)
  • Direct simulator actions (tap, type, swipe, hardware buttons)

Result: agents that can understand screen state before acting, then execute deterministic interactions.

What You Can Do

  • Build autonomous QA flows for iOS simulators
  • Run AI-driven smoke tests without brittle selectors
  • Automate onboarding/login/payment demos from natural language
  • Create self-healing UI scripts that use labels instead of fixed coordinates
  • Feed accessibility tree + screenshot to multimodal models for stronger reasoning

How It Works

AI Agent / MCP Client
        |
        v
   app-screen-mcp
        |
        +--> idb (UI tree + gestures + text + buttons)
        |
        +--> xcrun simctl (device lifecycle + screenshots + app launch)
        |
        v
   iOS Simulator

Feature Highlights

  • Full simulator discovery and boot control
  • App launch by bundle ID
  • Accessibility-first perception via normalized UI elements
  • Screenshot capture with resize and JPEG quality controls
  • Hash-based unchanged-image suppression to save tokens
  • tap_text for semantic interaction by visible label
  • tap_relative for resolution-independent tapping (for example 0.5, 0.5 = center)
  • get_screen_summary for one-call AI context (tree + screenshot)
  • Safe text input escaping in shell execution path
  • Tooling designed for Claude Desktop, Cursor, and any MCP-compatible client
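The "safe text input escaping" highlight is about keeping typed text from being interpreted by the shell. A minimal sketch of one common approach, POSIX single-quote escaping, is shown here. The helper names `shellQuote` and `buildTypeCommand` are hypothetical and not taken from this package, and the use of `idb ui text` as the underlying command is an assumption.

```typescript
// Hypothetical helper: quote a string so a POSIX shell treats it as one literal argument.
function shellQuote(text: string): string {
  // Wrap in single quotes; each embedded single quote becomes '\'' (close, escaped quote, reopen).
  return "'" + text.replace(/'/g, "'\\''") + "'";
}

// Hypothetical helper: build the command line for typing text into the focused field.
// Assumes `idb ui text` is the command used for text input.
function buildTypeCommand(text: string): string {
  return `idb ui text ${shellQuote(text)}`;
}

// Shell metacharacters such as $, ;, and ' stay inert inside the quoted argument.
console.log(buildTypeCommand("it's $HOME; rm -rf /"));
```

Passing arguments as an array to Node's `child_process.execFile` avoids the shell entirely and is usually the more robust option when it is available.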

Tool Catalog

| Tool | Purpose |
|---|---|
| list_simulators | List available simulators and current boot state |
| boot_simulator | Boot a simulator by UDID |
| launch_app | Launch an installed app by bundle_id |
| get_ui_tree | Return full normalized accessibility tree |
| take_screenshot | Return JPEG image with max_dim, quality, and unchanged-image suppression |
| get_screen_summary | Return UI tree plus optional screenshot (include_image, compact_tree, image hash metadata) |
| tap | Tap exact (x, y) coordinates |
| tap_relative | Tap relative (rx, ry) in [0,1] (0.5, 0.5 is center) |
| type_text | Type into currently focused field |
| swipe | Swipe between two points with optional duration |
| press_button | Press HOME, LOCK, SIDE_BUTTON, or SIRI |
| find_elements | Search UI elements by label/value/hint text |
| tap_text | Find first matching element by text and tap its center |

Token-Efficient Usage

Start with tree-only context, then request an image only when needed:

{
  "name": "get_screen_summary",
  "arguments": {
    "include_image": false,
    "compact_tree": true
  }
}
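The snippets in this section show only the tool name and arguments. MCP clients send these inside a JSON-RPC 2.0 `tools/call` request, which most client libraries build for you. As a rough sketch of what goes over the wire (the helper `toToolsCallRequest` and the id counter are illustrative names, not part of this package):

```typescript
interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: ToolCall;
}

// Illustrative request-id counter; real clients manage ids themselves.
let nextRequestId = 1;

// Wrap a tool name + arguments pair in a JSON-RPC 2.0 `tools/call` envelope.
function toToolsCallRequest(call: ToolCall): JsonRpcRequest {
  return { jsonrpc: "2.0", id: nextRequestId++, method: "tools/call", params: call };
}

const treeOnlyRequest = toToolsCallRequest({
  name: "get_screen_summary",
  arguments: { include_image: false, compact_tree: true },
});
console.log(JSON.stringify(treeOnlyRequest, null, 2));
```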

When an image is needed, compress it:

{
  "name": "get_screen_summary",
  "arguments": {
    "include_image": true,
    "max_dim": 720,
    "quality": 55
  }
}
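To get a feel for what `max_dim` does to payload size, here is a small sketch of aspect-preserving downscaling. It assumes `max_dim` caps the longer side of the image and never upscales; the server's actual resize rule may differ, and `fitWithin` is a hypothetical name.

```typescript
interface PixelSize {
  width: number;
  height: number;
}

// Scale so the longer side is at most maxDim, preserving aspect ratio.
// Images already within the limit are returned unchanged (no upscaling).
function fitWithin(size: PixelSize, maxDim: number): PixelSize {
  const longest = Math.max(size.width, size.height);
  if (longest <= maxDim) return { ...size };
  const scale = maxDim / longest;
  return {
    width: Math.max(1, Math.round(size.width * scale)),
    height: Math.max(1, Math.round(size.height * scale)),
  };
}

// A 1179x2556 screenshot capped at 720 on the longer side.
console.log(fitWithin({ width: 1179, height: 2556 }, 720));
```

Under these assumptions the pixel count drops by more than a factor of ten for that example, before JPEG `quality` is applied.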

Skip resending unchanged screenshots:

{
  "name": "get_screen_summary",
  "arguments": {
    "include_image": true,
    "only_if_changed": true,
    "previous_image_hash": "<last_hash>"
  }
}
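The idea behind `only_if_changed` can be sketched as follows: hash the encoded image, compare it with the hash the client sent back, and omit the image bytes when they match. SHA-256 and the result field names (`image_hash`, `changed`, `image`) are assumptions made for this illustration; the source does not state which hash or response shape the server uses.

```typescript
import { createHash } from "node:crypto";

// Hypothetical result shape for this sketch.
interface ScreenshotResult {
  image_hash: string;
  changed: boolean;
  image?: Uint8Array; // omitted when the screenshot is unchanged
}

// Hash the encoded image bytes (SHA-256 is an assumption).
function hashImage(bytes: Uint8Array): string {
  return createHash("sha256").update(bytes).digest("hex");
}

// Return the image only when its hash differs from the one the client already has.
function suppressIfUnchanged(bytes: Uint8Array, previousHash?: string): ScreenshotResult {
  const image_hash = hashImage(bytes);
  if (previousHash !== undefined && previousHash === image_hash) {
    return { image_hash, changed: false };
  }
  return { image_hash, changed: true, image: bytes };
}

const frameBytes = new Uint8Array([1, 2, 3]);
const firstResult = suppressIfUnchanged(frameBytes);
const secondResult = suppressIfUnchanged(frameBytes, firstResult.image_hash);
console.log(firstResult.changed, secondResult.changed); // true false
```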

Use relative taps when acting from image coordinates:

{
  "name": "tap_relative",
  "arguments": {
    "rx": 0.5,
    "ry": 0.5
  }
}
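Relative taps map a fraction of the screen to a concrete point, which is why they survive screenshot resizing. A minimal sketch of that conversion, assuming the server multiplies by the simulator's logical screen size and rounds (`toAbsolutePoint` and `LogicalSize` are hypothetical names):

```typescript
interface LogicalSize {
  width: number;
  height: number;
}

// Convert relative coordinates in [0, 1] to absolute screen points.
function toAbsolutePoint(rx: number, ry: number, size: LogicalSize): { x: number; y: number } {
  // The negated form also rejects NaN.
  if (!(rx >= 0 && rx <= 1 && ry >= 0 && ry <= 1)) {
    throw new RangeError("rx and ry must be within [0, 1]");
  }
  return { x: Math.round(rx * size.width), y: Math.round(ry * size.height) };
}

// Center of a 393x852 point screen.
console.log(toAbsolutePoint(0.5, 0.5, { width: 393, height: 852 }));
```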

Prerequisites

  • macOS with Xcode + iOS Simulator
  • Node.js 18+
  • idb tooling, installed with:

brew tap facebook/fb
brew install idb-companion
pip3 install fb-idb

Installation

git clone https://github.com/xmuweili/app-screen-mcp.git
cd app-screen-mcp
npm install
npm run build

Configure Your MCP Client

Claude Desktop

~/Library/Application Support/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "ios-simulator": {
      "command": "node",
      "args": ["/absolute/path/to/app-screen-mcp/dist/index.js"]
    }
  }
}

Cursor / VS Code MCP

{
  "mcp.servers": {
    "ios-simulator": {
      "command": "node",
      "args": ["/absolute/path/to/app-screen-mcp/dist/index.js"]
    }
  }
}

Restart your MCP client after updating config.

Avoid Repeated Permission Prompts

Prompt behavior is controlled by the MCP client, not this server.

Most GUI MCP clients (Claude Desktop, Cursor, Windsurf, Zed, Continue.dev) usually treat adding the server to the config as a grant of trust, so you should not see repeated tool approvals.

Claude Code (CLI)

Allow this server's tools in ~/.claude/settings.json:

{
  "permissions": {
    "allow": [
      "mcp__ios-simulator__*"
    ]
  }
}

ios-simulator must match the server name in your MCP config.

Use .claude/settings.json in project root if you want this scoped per-repo.

Codex CLI

Codex uses command-level approval. To avoid repeated prompts:

  • Approve once with "always allow" when Codex asks.
  • Save reusable prefix rules for common commands.
  • Typical prefix: ["xcrun", "simctl", "list", "devices", "--json"]
  • Typical prefix: ["idb", "list-targets"]
  • Typical prefix: ["idb", "list-apps", "--udid", "<SIMULATOR_UDID>"]

Codex may still prompt for new or higher-risk command patterns.
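To make "prefix rule" concrete: a command matches a rule when its leading arguments equal the rule's tokens. The sketch below only illustrates that idea and is not Codex's actual matching logic; `matchesPrefix` and `isPreapproved` are hypothetical names.

```typescript
// True when argv starts with every token of prefix, in order.
function matchesPrefix(argv: readonly string[], prefix: readonly string[]): boolean {
  return prefix.length <= argv.length && prefix.every((token, i) => argv[i] === token);
}

// Two of the typical prefixes listed above.
const allowedPrefixes: readonly (readonly string[])[] = [
  ["xcrun", "simctl", "list", "devices", "--json"],
  ["idb", "list-targets"],
];

function isPreapproved(argv: readonly string[]): boolean {
  return allowedPrefixes.some((prefix) => matchesPrefix(argv, prefix));
}

console.log(isPreapproved(["idb", "list-targets", "--json"])); // true
console.log(isPreapproved(["idb", "ui", "tap", "10", "20"])); // false
```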

Quick Agent Workflow

1) get_screen_summary()
2) find_elements("Sign In")
3) tap_text("Email")
4) type_text("<email>")
5) tap_text("Password")
6) type_text("••••••••")
7) tap_text("Sign In")
8) get_screen_summary()

This keeps actions grounded in visible state, not assumptions.
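The workflow above can be sketched as a list of tool calls executed strictly in order. In this sketch `callTool` stands in for whatever your MCP client exposes, and the argument keys `query` and `text` are assumptions; check the server's tool schemas for the real parameter names.

```typescript
type ToolArgs = Record<string, unknown>;
// Stand-in for an MCP client's tool-call function (hypothetical signature).
type CallTool = (toolName: string, args: ToolArgs) => Promise<unknown>;

interface WorkflowStep {
  tool: string;
  args: ToolArgs;
}

// Argument keys "query" and "text" are assumed for illustration.
const signInSteps: WorkflowStep[] = [
  { tool: "get_screen_summary", args: { include_image: false, compact_tree: true } },
  { tool: "find_elements", args: { query: "Sign In" } },
  { tool: "tap_text", args: { text: "Email" } },
  { tool: "type_text", args: { text: "<email>" } },
  { tool: "tap_text", args: { text: "Password" } },
  { tool: "type_text", args: { text: "<password>" } },
  { tool: "tap_text", args: { text: "Sign In" } },
  { tool: "get_screen_summary", args: { include_image: false, compact_tree: true } },
];

// Run steps one at a time: each action depends on the screen state left by the previous one.
async function runSteps(steps: readonly WorkflowStep[], callTool: CallTool): Promise<string[]> {
  const completed: string[] = [];
  for (const step of steps) {
    await callTool(step.tool, step.args);
    completed.push(step.tool);
  }
  return completed;
}

// Demo client that only logs each call instead of talking to a server.
const loggingClient: CallTool = async (toolName, args) => {
  console.log(toolName, JSON.stringify(args));
  return null;
};

runSteps(signInSteps, loggingClient).then((completed) => {
  console.log(`${completed.length} calls issued`);
});
```

In a real agent, the result of each `get_screen_summary` or `find_elements` call would decide the next step rather than a fixed list.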

Local Development

npm run build
npm start

Main implementation lives in:

  • src/index.ts

Reliability Notes

  • If udid is omitted, tools default to the currently booted simulator.
  • tap_text and find_elements rely on accessibility labels/values/hints.
  • Better accessibility metadata in your app means better AI performance.
  • If no simulator is booted, the server returns a clear MCP error.
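Since `tap_text` and `find_elements` depend on accessibility metadata, it helps to picture the matching step. A minimal sketch, assuming elements carry optional `label`, `value`, and `hint` strings plus a `frame` rectangle, and that matching is a case-insensitive substring test (the element shape and matching rule are assumptions, not the server's documented behavior):

```typescript
// Hypothetical normalized element shape.
interface UiElement {
  label?: string;
  value?: string;
  hint?: string;
  frame: { x: number; y: number; width: number; height: number };
}

// Return elements whose label, value, or hint contains the query (case-insensitive).
function findElements(elements: readonly UiElement[], query: string): UiElement[] {
  const needle = query.trim().toLowerCase();
  if (needle === "") return [];
  return elements.filter((el) =>
    [el.label, el.value, el.hint].some((field) => (field ?? "").toLowerCase().includes(needle)),
  );
}

// Center point of an element's frame.
function centerOf(el: UiElement): { x: number; y: number } {
  const { x, y, width, height } = el.frame;
  return { x: x + width / 2, y: y + height / 2 };
}

// Resolve a text query to the tap point of the first matching element.
function resolveTapText(elements: readonly UiElement[], query: string): { x: number; y: number } {
  const [firstMatch] = findElements(elements, query);
  if (!firstMatch) throw new Error(`No element matches "${query}"`);
  return centerOf(firstMatch);
}

const sampleElements: UiElement[] = [
  { label: "Email", frame: { x: 20, y: 200, width: 353, height: 44 } },
  { label: "Sign In", frame: { x: 20, y: 320, width: 353, height: 50 } },
];
console.log(resolveTapText(sampleElements, "sign in")); // { x: 196.5, y: 345 }
```

An element with no label, value, or hint can never match here, which is the practical reason better accessibility metadata leads to better results.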

Troubleshooting

  • No iOS simulator is currently running: boot one via Simulator or call boot_simulator.
  • idb command failures: verify idb/idb-companion installation and PATH.
  • Empty or weak element matches: improve app accessibility labels/semantics.
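When debugging the "no simulator running" case, you can check what `xcrun simctl list devices --json` reports. The sketch below parses that JSON and picks the first device whose state is `Booted`. It is an illustration of the default-UDID behavior described above, not the server's code; `findBootedUdid` is a hypothetical name and the sample data is made up.

```typescript
interface SimDevice {
  udid: string;
  name: string;
  state: string;
}

// Shape of `xcrun simctl list devices --json`: devices grouped by runtime identifier.
interface SimctlDeviceList {
  devices: Record<string, SimDevice[]>;
}

// Return the UDID of the first booted simulator, or throw if none is booted.
function findBootedUdid(simctlJson: string): string {
  const parsed = JSON.parse(simctlJson) as SimctlDeviceList;
  for (const runtimeDevices of Object.values(parsed.devices)) {
    const booted = runtimeDevices.find((device) => device.state === "Booted");
    if (booted) return booted.udid;
  }
  throw new Error("No iOS simulator is currently running");
}

// Made-up sample output for demonstration.
const sampleSimctlOutput = JSON.stringify({
  devices: {
    "com.apple.CoreSimulator.SimRuntime.iOS-17-0": [
      { udid: "AAAA-1111", name: "iPhone 15", state: "Shutdown" },
      { udid: "BBBB-2222", name: "iPhone 15 Pro", state: "Booted" },
    ],
  },
});
console.log(findBootedUdid(sampleSimctlOutput)); // BBBB-2222
```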

License

MIT