computer-control-mcp
v2.1.0
MCP server that allows Claude to control your computer via screenshots and input automation
computer-control-mcp
An MCP (Model Context Protocol) server that allows Claude to control your computer using macOS Accessibility APIs for reliable UI automation.
Why Accessibility-Based Control?
Traditional screenshot-based automation is error-prone:
- Screenshots get scaled, making coordinate reading imprecise
- Multiple round-trips needed: screenshot → zoom → guess coords → click → verify
- No semantic understanding of UI elements
This server uses macOS Accessibility APIs instead, exposing the UI as structured data:
- Click elements by name: ax_click_element({ criteria: { title: "Save" } })
- No coordinate guessing required
- Faster and more reliable
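To make the scaling problem concrete, here is a minimal sketch of the arithmetic involved when coordinates are read off a downscaled screenshot. The helper name, the 3840-pixel display, and the 1280-pixel image width are illustrative assumptions, not values taken from this server.

```typescript
// Hypothetical helper: map an x coordinate read from a scaled screenshot
// back to screen pixels. Not part of this server's API.
function toScreenX(imageX: number, imageWidth: number, screenWidth: number): number {
  return Math.round(imageX * (screenWidth / imageWidth));
}

// Assumed sizes: a 3840 px wide display shown to the model as a 1280 px image,
// so each image pixel covers 3 screen pixels.
const scaleFactor = 3840 / 1280;

// Misreading the coordinate by just 2 image pixels moves the click 6 screen pixels.
const intended = toScreenX(400, 1280, 3840); // 1200
const misread = toScreenX(402, 1280, 3840); // 1206
const errorInScreenPixels = misread - intended; // 6
```

On a dense toolbar, an error of that size can be enough to land on a neighbouring control, which is the failure mode that name-based clicking avoids.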
Features
- Accessibility-based interaction - Click, type, and navigate by element name/role
- UI tree inspection - See the entire app hierarchy as JSON
- Element search - Find buttons, text fields, etc. by criteria
- Multi-monitor support - Screenshots across all displays
- Fallback visual tools - Screenshots with grid overlay when needed
Installation
Using npx (recommended)
npx computer-control-mcp
Global install
npm install -g computer-control-mcp
computer-control-mcp
Configuration
Claude Code
Add the server to your MCP settings via the /mcp command, or edit your config directly:
{
  "mcpServers": {
    "computer-control": {
      "command": "npx",
      "args": ["computer-control-mcp"]
    }
  }
}
Claude Desktop
Add to your claude_desktop_config.json:
{
  "mcpServers": {
    "computer-control": {
      "command": "npx",
      "args": ["computer-control-mcp"]
    }
  }
}
macOS Permissions Required
The server needs these permissions in System Settings → Privacy & Security:
- Accessibility - Required for UI automation and accessibility tree access
- Screen Recording - Required for screenshots (fallback method)
Available Tools
Accessibility Tools (Recommended)
| Tool | Description |
|------|-------------|
| ax_get_running_apps | List running apps with accessibility info |
| ax_get_ui_tree | Get UI hierarchy as structured JSON |
| ax_find_elements | Search for elements by role/title/criteria |
| ax_click_element | Click by name, not coordinates |
| ax_type_into_element | Focus element and type text |
| ax_activate_app | Bring an app to the front |
| ax_get_focused_element | Get the currently focused element |
| ax_perform_action | Perform any accessibility action |
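As a rough illustration of what working with the UI hierarchy as structured JSON might look like on the client side, the sketch below walks a tree and collects nodes by role, stopping at a depth limit. The node shape (role, title, children) and the sample tree are assumptions for illustration; the actual output of ax_get_ui_tree may use different fields.

```typescript
// Assumed node shape; the real ax_get_ui_tree output may differ.
interface UiNode {
  role: string;
  title?: string;
  children?: UiNode[];
}

// Depth-first search that does not descend past maxDepth (the root is depth 0).
function findByRole(root: UiNode, role: string, maxDepth: number): UiNode[] {
  const matches: UiNode[] = [];
  const visit = (node: UiNode, depth: number): void => {
    if (node.role === role) matches.push(node);
    if (depth >= maxDepth) return;
    for (const child of node.children ?? []) visit(child, depth + 1);
  };
  visit(root, 0);
  return matches;
}

// Made-up sample tree for illustration only.
const sampleTree: UiNode = {
  role: "AXWindow",
  title: "Untitled",
  children: [
    { role: "AXToolbar", children: [{ role: "AXButton", title: "Save" }] },
    { role: "AXButton", title: "Close" },
  ],
};

const shallow = findByRole(sampleTree, "AXButton", 1).map((n) => n.title); // ["Close"]
const deep = findByRole(sampleTree, "AXButton", 2).map((n) => n.title); // ["Save", "Close"]
```

The depth limit matters in this sketch: the Save button sits one level deeper than Close, so it only shows up once the search is allowed to descend into the toolbar.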
Visual Tools (Fallback)
| Tool | Description |
|------|-------------|
| get_screens | List all connected monitors |
| take_screenshot | Capture both clean and grid-overlay images |
| take_screenshot_clean | Screenshot without grid |
| take_screenshot_grid | Screenshot with grid only |
| zoom_screenshot | Zoom into a region for precise coordinates |
| click | Click at x, y coordinates |
| double_click | Double-click at x, y |
| right_click | Right-click at x, y |
| move_mouse | Move cursor to x, y |
| drag | Drag from one position to another |
| type_text | Type text at cursor position |
| press_key | Press a single key |
| hotkey | Press a keyboard shortcut |
| scroll | Scroll up/down |
| get_mouse_position | Get current cursor position |
| run_actions | Execute multiple actions in sequence |
Usage Examples
Click a Button (Accessibility)
// Old way: 6+ tool calls
// take_screenshot() → zoom_screenshot() → read coords → click(523, 847) → verify
// New way: 1 tool call
ax_click_element({
  criteria: { role: "AXButton", title: "Save" },
  app: "TextEdit"
})
Fill a Form
// Type into username field
ax_type_into_element({
  criteria: { role: "AXTextField", title_contains: "username" },
  text: "myuser"
})
// Type into password field
ax_type_into_element({
  criteria: { role: "AXSecureTextField" },
  text: "mypassword"
})
// Click submit
ax_click_element({
  criteria: { role: "AXButton", title: "Log in" }
})
Explore an App's UI
// See what's available in Finder
ax_get_ui_tree({ app: "Finder", max_depth: 4 })
// Find all buttons
ax_find_elements({
  criteria: { role: "AXButton" },
  app: "Finder"
})
Activate an App
// Bring Safari to front
ax_activate_app({ app: "Safari" })
// Or by bundle ID
ax_activate_app({ app: "com.apple.Safari" })
Element Criteria
When searching for elements, you can use these criteria:
| Criteria | Description |
|----------|-------------|
| role | Element type: AXButton, AXTextField, AXStaticText, etc. |
| title | Exact title match |
| title_contains | Title substring (case-insensitive) |
| value | Element value (for text fields) |
| value_contains | Value substring |
| description | Accessibility description |
| identifier | Accessibility identifier |
| enabled | Filter by enabled state |
| focused | Filter by focus state |
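The sketch below shows one way several criteria could combine when matching an element. It assumes that every criterion present must match (a logical AND), which the table does not state explicitly; the AxElement and Criteria types and the matches function are hypothetical, and only a subset of the criteria is covered.

```typescript
// Assumed shapes for illustration; field names follow the criteria table.
interface AxElement {
  role: string;
  title?: string;
  enabled?: boolean;
}

interface Criteria {
  role?: string;
  title?: string;
  title_contains?: string;
  enabled?: boolean;
}

// Assumption: every criterion that is present must match (logical AND).
function matches(el: AxElement, c: Criteria): boolean {
  if (c.role !== undefined && el.role !== c.role) return false;
  // title is an exact match
  if (c.title !== undefined && el.title !== c.title) return false;
  // title_contains is a case-insensitive substring match
  if (c.title_contains !== undefined) {
    const title = (el.title ?? "").toLowerCase();
    if (!title.includes(c.title_contains.toLowerCase())) return false;
  }
  if (c.enabled !== undefined && el.enabled !== c.enabled) return false;
  return true;
}

const field: AxElement = { role: "AXTextField", title: "Username or email", enabled: true };

const byRoleAndSubstring = matches(field, { role: "AXTextField", title_contains: "username" }); // true
const byExactTitle = matches(field, { title: "username" }); // false: title must match exactly
```

In practice this suggests preferring title_contains when the exact label is uncertain, and adding role to narrow the result when several elements share similar text.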
Common AX Roles
| Role | Description |
|------|-------------|
| AXButton | Buttons |
| AXTextField | Text input fields |
| AXSecureTextField | Password fields |
| AXStaticText | Labels/text |
| AXCheckBox | Checkboxes |
| AXRadioButton | Radio buttons |
| AXPopUpButton | Dropdown menus |
| AXMenuItem | Menu items |
| AXWindow | Windows |
| AXToolbar | Toolbars |
| AXWebArea | Web content |
Multi-Monitor Usage
Get screen info:
get_screens() // Returns: [{ index: 0, width: 1920, height: 1080, is_primary: true }, ...]
Screenshot specific screen:
take_screenshot({ screen_index: 0 }) // Main screen
take_screenshot({ screen_index: 1 }) // Second screen
take_screenshot() // All screens
Coordinates are absolute across all screens.
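Because coordinates are absolute, a position measured relative to one monitor has to be offset by that monitor's origin before it is used. The sketch below shows that arithmetic. The x and y origin fields and the side-by-side layout are assumptions: the get_screens output shown above only lists index, width, height, and is_primary, so the real data may expose origins differently or not at all.

```typescript
// Assumed screen description; the x/y origin fields are hypothetical and
// may not match what get_screens actually returns.
interface ScreenInfo {
  index: number;
  x: number; // assumed: left edge in the global coordinate space
  y: number; // assumed: top edge in the global coordinate space
  width: number;
  height: number;
}

// Convert a point relative to one screen into absolute coordinates.
function toAbsolute(
  screens: ScreenInfo[],
  screenIndex: number,
  localX: number,
  localY: number
): { x: number; y: number } {
  const target = screens.find((s) => s.index === screenIndex);
  if (!target) throw new Error(`No screen with index ${screenIndex}`);
  if (localX < 0 || localY < 0 || localX >= target.width || localY >= target.height) {
    throw new Error("Point lies outside the target screen");
  }
  return { x: target.x + localX, y: target.y + localY };
}

// Assumed layout: a second 1920x1080 monitor placed to the right of the primary.
const layout: ScreenInfo[] = [
  { index: 0, x: 0, y: 0, width: 1920, height: 1080 },
  { index: 1, x: 1920, y: 0, width: 1920, height: 1080 },
];

const point = toAbsolute(layout, 1, 100, 200); // { x: 2020, y: 200 }
```

Under these assumptions, the resulting pair is the kind of x, y value that the coordinate-based fallback tools such as click would take.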
Requirements
- macOS only - Uses macOS-specific Accessibility APIs
- Node.js 18+
- Swift - Required to compile the accessibility helper (included in Xcode Command Line Tools)
Building from Source
git clone https://github.com/your-repo/computer-control-mcp
cd computer-control-mcp
npm install
npm run build # Compiles TypeScript and Swift
License
MIT
