lazy-desktop-mcp
v0.1.5
Local-first desktop control MCP server with a Rust host and Node launcher.
lazy-desktop-mcp
lazy-desktop-mcp is a local-first desktop automation MCP stack with a Rust host process and an npm-distributed launcher.
What Ships
- desktop-core: shared types, policy evaluation, audit payload handling, and host wire protocol
- desktop-host: local privileged host with audit storage, session handling, screenshot capture, and platform adapters
- desktop-mcp: MCP stdio server that proxies tool calls to desktop-host
- lazy-desktop-mcp: Node launcher published to npm so Codex can start the MCP server with npx or a global install
Security Defaults
The public package is intentionally locked down until the operator configures a host policy file.
- desktop.capabilities, desktop.permissions, session.open, and session.close are always available
- standalone capabilities such as app.list and observe.capture are disabled until allowed by host policy
- session capabilities such as app.launch are disabled until allowed by host policy
- raw coordinate input is disabled unless explicitly enabled by host policy
- on macOS, out-of-policy app, window, and session-scope requests can trigger a local user approval dialog that persists a target-only allowlist overlay
- desktop-mcp refuses to start if it cannot find the expected desktop-host binary
See SECURITY.md and docs/security-model.md before enabling desktop control features.
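The default-deny behavior listed above can be sketched as a small gate function. This is only an illustration, not the host's actual implementation: the function name `is_allowed`, the dot-to-underscore mapping between tool names (`app.list`) and policy entries (`app_list`), and the rule that session capabilities require an open session are all assumptions made for the sketch.

```python
# Hypothetical sketch of default-deny capability gating.
# Names and the dot-to-underscore mapping are assumptions, not the host's API.

ALWAYS_AVAILABLE = {
    "desktop.capabilities",
    "desktop.permissions",
    "session.open",
    "session.close",
}


def is_allowed(tool: str, policy: dict, in_session: bool = False) -> bool:
    """Return True if `tool` may run under `policy`; deny by default."""
    if tool in ALWAYS_AVAILABLE:
        return True
    key = tool.replace(".", "_")  # assumed mapping: "app.list" -> "app_list"
    if key in policy.get("allowed_standalone_capabilities", []):
        return True
    # Assumption: session capabilities also need an open session.
    return in_session and key in policy.get("allowed_session_capabilities", [])


example_policy = {
    "allowed_standalone_capabilities": ["app_list", "observe_capture", "ocr_read"],
    "allowed_session_capabilities": ["app_launch"],
}

print(is_allowed("session.open", {}))                             # always available
print(is_allowed("app.list", {}))                                 # no policy: denied
print(is_allowed("app.launch", example_policy, in_session=True))  # allowed by policy
```

With an empty policy only the four always-available tools pass, which mirrors the locked-down state of a fresh install.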
Installation
The npm package builds native binaries during postinstall, so the target machine needs:
- Node.js 20+
- Rust and Cargo
Install globally:
npm install -g lazy-desktop-mcp
Or run without a global install:
npx -y lazy-desktop-mcp
The published package was smoke-tested from the npm registry with npx -y lazy-desktop-mcp on macOS, including a real MCP initialize handshake.
If you want to skip the install-time build for CI or packaging experiments:
LAZY_DESKTOP_SKIP_POSTINSTALL=1 npm install
npm run build:native
Host Policy
The host reads a JSON policy file from LAZY_DESKTOP_POLICY_PATH or its local application data directory. Start from the shipped example:
cp config/policy.example.json /path/to/policy.json
export LAZY_DESKTOP_POLICY_PATH=/path/to/policy.json
Example policy:
{
  "allowed_standalone_capabilities": ["app_list", "observe_capture", "ocr_read"],
  "allowed_session_capabilities": ["app_launch"],
  "allowed_apps": ["TextEdit"],
  "allowed_windows": [],
  "allowed_screens": ["primary"],
  "allow_raw_input": false,
  "max_actions_per_minute": 30
}
For repeatable local development, this repository also ships:
- config/client-config.json: canonical development config source for client wiring
- config/policy.dev.json: generated development policy for interactive desktop workflows
Use the sync script to regenerate policy.dev.json and upsert the matching client entries:
npm run sync:clients
Preview the rendered policy and client config without writing files:
npm run sync:clients:dry
The sync flow defaults to both Codex and OpenCode. Override the target set with LAZY_DESKTOP_CLIENTS=codex, LAZY_DESKTOP_CLIENTS=opencode, or explicit config paths via CODEX_CONFIG_PATH and OPENCODE_CONFIG_PATH.
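Before pointing the host at a hand-edited policy file, it can help to check that the fields have the same types as in the example policy. The sketch below does that with the standard library only. The field names come from the example policy; treating every field as required, and the helper name `check_policy`, are assumptions for illustration and may not match what the host actually enforces.

```python
import json

# Field names and types mirror the example policy. Treating all of them as
# required is an assumption made for this sketch.
EXPECTED_TYPES = {
    "allowed_standalone_capabilities": list,
    "allowed_session_capabilities": list,
    "allowed_apps": list,
    "allowed_windows": list,
    "allowed_screens": list,
    "allow_raw_input": bool,
    "max_actions_per_minute": int,
}

EXAMPLE = """
{
  "allowed_standalone_capabilities": ["app_list", "observe_capture", "ocr_read"],
  "allowed_session_capabilities": ["app_launch"],
  "allowed_apps": ["TextEdit"],
  "allowed_windows": [],
  "allowed_screens": ["primary"],
  "allow_raw_input": false,
  "max_actions_per_minute": 30
}
"""


def check_policy(text: str) -> list:
    """Return a list of human-readable problems found in a policy document."""
    policy = json.loads(text)
    problems = []
    for key, expected in EXPECTED_TYPES.items():
        if key not in policy:
            problems.append(f"missing field: {key}")
            continue
        value = policy[key]
        # bool is a subclass of int in Python, so reject it for integer fields.
        wrong = not isinstance(value, expected) or (
            expected is int and isinstance(value, bool)
        )
        if wrong:
            problems.append(f"wrong type: {key}")
    return problems


print(check_policy(EXAMPLE))
```

An empty list means the file has the same shape as the shipped example; it says nothing about whether the host will accept the values themselves.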
Runtime Approval Overlay
When the host policy enables a capability class but the requested app, window, or screen target is outside the configured allowlist, the macOS system backend can ask the logged-in user for approval.
- the dialog is local to the target machine and uses the native macOS dialog UI
- Allow persists only the requested target into a local policy-overlay.json
- Deny, closing the dialog, or timeout keeps the request blocked
- runtime approval never enables a new capability class and never enables raw coordinate input
The overlay file is stored under the host application data directory and merged with the base policy at startup. Delete that overlay file if you need to clear previously approved targets.
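The merge at startup can be pictured as a union of target allowlists that leaves everything else in the base policy untouched. The following is a minimal sketch under stated assumptions: the real layout of policy-overlay.json is not documented here, so the sketch assumes it reuses the base policy's target field names (`allowed_apps`, `allowed_windows`, `allowed_screens`), and `merge_overlay` is a hypothetical helper name.

```python
# Assumption: the overlay reuses the base policy's target field names.
TARGET_KEYS = ("allowed_apps", "allowed_windows", "allowed_screens")


def merge_overlay(base: dict, overlay: dict) -> dict:
    """Merge approved targets into a copy of the base policy.

    Only target allowlists are merged. Capability lists and allow_raw_input
    are always taken from the base policy, so an overlay cannot widen them.
    """
    merged = dict(base)
    for key in TARGET_KEYS:
        combined = list(base.get(key, []))
        for target in overlay.get(key, []):
            if target not in combined:
                combined.append(target)
        merged[key] = combined
    return merged


base = {
    "allowed_session_capabilities": ["app_launch"],
    "allowed_apps": ["TextEdit"],
    "allow_raw_input": False,
}
overlay = {
    "allowed_apps": ["Notes"],
    "allow_raw_input": True,  # ignored: not a target key
}
print(merge_overlay(base, overlay))
```

Restricting the merge to a fixed set of target keys is what keeps an approval from turning into a new capability class or enabling raw coordinate input.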
Client Setup
The published package can still be registered manually with Codex:
codex mcp add lazy-desktop \
  -- npx --prefix ~/.codex/mcp-cache/lazy-desktop-mcp -y lazy-desktop-mcp
The isolated --prefix keeps npm's execution context stable even when Codex is launched from a repository that has the same package name as the published MCP package.
If you need an explicit config entry:
[mcp_servers.lazy-desktop]
command = "npx"
args = ["--prefix", "/absolute/path/to/.codex/mcp-cache/lazy-desktop-mcp", "-y", "lazy-desktop-mcp"]
[mcp_servers.lazy-desktop.env]
LAZY_DESKTOP_POLICY_PATH = "/absolute/path/to/policy.json"
If you prefer a fully deterministic local install, installing with npm install -g lazy-desktop-mcp and pointing Codex at the global lazy-desktop-mcp binary also works.
For local repository development, prefer npm run sync:clients; it wires both Codex and OpenCode to the checked-out target/release binaries and the repo-managed development policy.
Desktop App Development
lazy-desktop-mcp is meant to sit above framework-specific desktop stacks.
- Tauri: use it for launch, window targeting, input orchestration, screenshot capture, OCR, and local operator approval loops around a real packaged or dev-run desktop shell
- PyQt: use it for native widget smoke tests, focused regression checks, and screenshot-driven debugging when browser tooling is not available
- keep framework-native tests for deterministic unit/component coverage; use the MCP for end-to-end operator flows and local exploratory validation
The standard local development workflow is:
- Build the native binaries with npm run build:native
- Sync the repo-managed client config with npm run sync:clients
- Grant macOS Accessibility, Automation, and Screen Recording if the backend needs them
- Start the target Tauri or PyQt app
- Verify live availability with desktop.capabilities, desktop.permissions, and desktop.runtime
- Open a scoped session, then run app/window/input/capture/OCR or vision steps as needed
For interactive flows, prefer the higher-level tools first:
- app.activate when you want to bring an app to the front without depending on an exact window title
- selector-based window.focus using window_id, exact title, partial title_contains, or app
- input.click_target for OCR-matched text or window-relative clicks before falling back to raw coordinates
See docs/desktop-app-development.md for a more detailed workflow and troubleshooting notes.
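For readers driving the server by hand, the higher-level tools above are invoked through ordinary MCP `tools/call` requests, which are JSON-RPC 2.0 messages sent over stdio after the initialize handshake. The sketch below only builds the request payloads. The `tool_call` helper is hypothetical, and the argument names passed to app.activate are an assumption; the window.focus selector names (`window_id`, `title`, `title_contains`, `app`) are the ones listed above, but their exact schema should be confirmed from the tool listing the server reports.

```python
import json


def tool_call(request_id: int, name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a single JSON-RPC 2.0 line."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {"name": name, "arguments": arguments},
        }
    )


# Assumed argument shape for app.activate; check the server's tool schema.
print(tool_call(1, "app.activate", {"app": "TextEdit"}))

# Selector-based focus using a partial title match.
print(tool_call(2, "window.focus", {"title_contains": "Untitled"}))
```

In practice an MCP client such as Codex or OpenCode builds these messages for you; the sketch is mainly useful when debugging the server directly.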
Runtime Availability
The development policy enables app launch, window control, screenshot capture, OCR, and interactive input by default. Vision remains optional and only becomes available when a local vision command is configured. Actual runtime availability still depends on:
- the current backend implementation on the active platform
- local OS permissions such as Accessibility, Automation, and Screen Recording
- optional dependencies such as tesseract
- optional vision command wiring
Use desktop.capabilities, desktop.permissions, and desktop.runtime as the source of truth for the current machine instead of assuming a static capability matrix.
If a capability shows Disabled by the host security policy, inspect desktop.runtime first. It returns the active security_policy_path, overlay path, and the effective host policy so you can immediately tell whether Codex or OpenCode is pointing at the intended development policy. A common local-development mistake is wiring the client to config/policy.example.json instead of the repo-managed config/policy.dev.json; rerun npm run sync:clients and restart the client after rebuilding when that happens.
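The check described above can be scripted once the desktop.runtime result is available as a dictionary. This is a rough sketch: only the `security_policy_path` field name comes from the text above, the rest of the result shape is not assumed, and `policy_hint` is a hypothetical helper.

```python
from pathlib import PurePosixPath


def policy_hint(runtime: dict) -> str:
    """Classify which policy file a desktop.runtime result points at."""
    path = runtime.get("security_policy_path") or ""
    name = PurePosixPath(path).name
    if name == "policy.dev.json":
        return "development policy is active"
    if name == "policy.example.json":
        return "example policy is active: rerun npm run sync:clients and restart the client"
    return f"custom or unknown policy file: {path or '(none reported)'}"


print(policy_hint({"security_policy_path": "/repo/config/policy.example.json"}))
```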
Local Development
Build native binaries:
npm run build:native
npm run sync:clients
Run the full verification stack:
npm run security
npm run verify
npm run pack:dry
Prepare an npm release without publishing yet:
npm run release:prep
npm run release:notes
npm run release:check
The verification flow runs:
- JavaScript wrapper tests
- cargo fmt --check
- cargo clippy -D warnings
- cargo test
- cargo audit
- npm pack --dry-run
Publishing
Before npm publish, make sure:
- the version in package.json matches the Rust workspace version in Cargo.toml
- the canonical client config in config/client-config.json still produces the intended development policy
- the policy example still matches the shipped host behavior
- the README and security docs reflect the actual supported capabilities
- npm run release:prep passes with a clean worktree and a version that is newer than the latest git tag
- npm run release:notes produces the release note draft you intend to ship
See docs/publishing.md for the release checklist.
