@iflow-mcp/agentify-sh-desktop
v0.1.3
Agentify Desktop
Agentify Desktop is a local-first control center for AI work: connect your real, logged-in AI subscriptions to your MCP-compatible CLI tools, all on your own machine.
Why teams keep it open
- 🌐 Real browser sessions, real accounts: automate the web UIs you already use, without API-key migration.
- 🔌 MCP-native integration: works with Codex, Claude Code, OpenCode, and other MCP-capable clients.
- 🧵 Parallel tabs for parallel work: run multiple isolated workflows at once using stable tab keys.
- 📎 Practical I/O support: upload files and download generated images from assistant responses.
Supported sites
Supported
- chatgpt.com
- perplexity.ai
- claude.ai
- aistudio.google.com
- gemini.google.com
- grok.com
Planned
- Additional vendor profiles via vendors.json + selector overrides.
CAPTCHA policy (human-in-the-loop)
Agentify Desktop does not attempt to bypass CAPTCHAs or use third-party solvers. If a human verification appears, the app pauses automation, brings the relevant window to the front, and waits for you to complete the check manually.
Requirements
- Node.js 20+ (22 recommended)
- MCP-capable CLI (optional, for MCP): Codex, Claude Code, or OpenCode
Quickstart (macOS/Linux)
Quickstart installs dependencies, auto-registers the MCP server for installed clients (Codex/Claude Code/OpenCode), and starts Agentify Desktop:
git clone git@github.com:agentify-sh/desktop.git
cd desktop
./scripts/quickstart.sh
Debug-friendly: show newly-created tab windows by default:
./scripts/quickstart.sh --show-tabs
Foreground mode (logs to your terminal, Ctrl+C to stop):
./scripts/quickstart.sh --foreground
Choose MCP registration target explicitly:
./scripts/quickstart.sh --client auto # default
./scripts/quickstart.sh --client codex
./scripts/quickstart.sh --client claude
./scripts/quickstart.sh --client opencode
./scripts/quickstart.sh --client all
./scripts/quickstart.sh --client none
Manual install & run
npm i
npm run start
The Agentify Control Center opens. Use it to:
- Show/hide tabs (each tab is a separate window)
- Create tabs for ChatGPT, Perplexity, Claude, Google AI Studio, Gemini, and Grok
- Tune automation safety limits (governor)
- Manage the optional “single-chat emulator” orchestrator
Sign in to your target vendor in the tab window.
If your account uses SSO (Google/Microsoft/Apple), keep Settings → Allow auth popups enabled in the Control Center. ChatGPT login often opens provider auth in a popup, and blocking popups can prevent login from completing.
Connect from MCP clients
Quickstart can register MCP automatically, but manual commands are below if you prefer explicit setup.
Codex
From the repo root:
codex mcp add agentify-desktop -- node mcp-server.mjs [--show-tabs]
From anywhere (absolute path):
codex mcp add agentify-desktop -- node /ABS/PATH/TO/desktop/mcp-server.mjs [--show-tabs]
Confirm registration:
codex mcp list
Claude Code
From the repo root:
claude mcp add --transport stdio agentify-desktop -- node mcp-server.mjs [--show-tabs]
From anywhere (absolute path):
claude mcp add --transport stdio agentify-desktop -- node /ABS/PATH/TO/desktop/mcp-server.mjs [--show-tabs]
Confirm registration:
claude mcp list
OpenCode
OpenCode can be configured in ~/.config/opencode/opencode.json:
{
"mcp": {
"agentify-desktop": {
"type": "local",
"command": ["node", "/ABS/PATH/TO/desktop/mcp-server.mjs"],
"enabled": true
}
}
}
./scripts/quickstart.sh --client opencode (or --client all) writes/updates this entry automatically.
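If you maintain the config by hand, the important property of an update is that it leaves your other settings and other MCP servers alone. The following is a minimal sketch of such a merge, not the quickstart script's actual code: `upsertAgentifyEntry` is a hypothetical helper, and the `theme` and `other` keys in the usage are made up for illustration.

```javascript
// Hypothetical helper (not part of Agentify Desktop): return a copy of an
// OpenCode config object with the agentify-desktop MCP entry added or replaced.
function upsertAgentifyEntry(config, serverPath) {
  const existing = config && typeof config === "object" ? config : {};
  return {
    ...existing,
    mcp: {
      ...(existing.mcp || {}),
      "agentify-desktop": {
        type: "local",
        command: ["node", serverPath],
        enabled: true,
      },
    },
  };
}

// Usage: "theme" and "other" are illustrative keys, assumed for this example.
const before = {
  theme: "dark",
  mcp: { other: { type: "local", command: ["x"], enabled: false } },
};
const after = upsertAgentifyEntry(before, "/ABS/PATH/TO/desktop/mcp-server.mjs");
console.log(JSON.stringify(after, null, 2));
```

The helper returns a new object instead of mutating its input, so a failed write cannot leave a half-updated config in memory.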
Confirm registration:
opencode mcp list
If you already had your client open, restart it (or start a new session) so it reloads MCP server config.
Developer workflows (natural language)
Use plain requests in your MCP client. You usually do not need to call tool IDs directly.
- Plan in ChatGPT Pro or Gemini Deep Think, then execute in phases. Prompt: "Open a Gemini tab with key plan-auth-v2, ask Deep Think for a migration plan from session cookies to JWT in this repo, and return a 10-step checklist with risk and rollback per step." Follow-up: "Now use key plan-auth-v2 and generate step 1 implementation only, including tests."
- Prompt all vendors and compare output quality before coding. Prompt: "Create tabs for keys cmp-chatgpt, cmp-claude, cmp-gemini, and cmp-perplexity. Send the same architecture prompt to each. Then compare responses in a table by correctness, operational risk, implementation complexity, and testability."
- Run incident triage with attached evidence. Prompt: "Open key incident-prod-api, send ./incident/error.log and ./incident/dashboard.png, and produce: likely root cause, 30-minute hotfix plan, rollback, and validation checklist."
Use explicit tool calls (agentify_query, agentify_read_page, etc.) when you need deterministic/reproducible runs or when debugging tool selection.
How to use (practical)
- Use ChatGPT/Perplexity/Claude/AI Studio/Gemini/Grok normally (manual): write a plan/spec in the UI, then in your MCP client call agentify_read_page to pull the transcript into your workflow.
- Drive ChatGPT/Perplexity/Claude/AI Studio/Gemini/Grok from your MCP client: call agentify_ensure_ready, then agentify_query with a prompt. Use a stable key per project to keep parallel jobs isolated.
- Parallel jobs: create/ensure a tab per project with agentify_tab_create(key: ...), then use that key for agentify_query, agentify_read_page, and agentify_download_images.
- Upload files: pass local paths via attachments to agentify_query (best-effort; depends on the site UI).
- Generate/download images: ask for images via agentify_query (then call agentify_download_images), or use agentify_image_gen (prompt + download).
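For reproducible runs it can help to write the call order down explicitly. The sketch below lists the tool calls for one project key as plain data; it does not talk to the MCP server. The tool names and the `key`, `prompt`, and `attachments` fields come from this README, while `planCalls`, `wantImages`, and passing `key` to agentify_ensure_ready and agentify_download_images are assumptions made for illustration.

```javascript
// Hypothetical helper: list the MCP tool calls for one project key, in order.
// Assumption: agentify_ensure_ready and agentify_download_images accept `key`.
function planCalls({ key, prompt, attachments = [], wantImages = false }) {
  const query = { key, prompt };
  // Attachments are best-effort, so only include the field when it is used.
  if (attachments.length > 0) query.attachments = attachments;

  const calls = [
    { tool: "agentify_tab_create", arguments: { key } },
    { tool: "agentify_ensure_ready", arguments: { key } },
    { tool: "agentify_query", arguments: query },
  ];
  if (wantImages) {
    calls.push({ tool: "agentify_download_images", arguments: { key } });
  }
  return calls;
}

// Usage: every call carries the same stable key, which keeps jobs isolated.
const calls = planCalls({
  key: "incident-prod-api",
  prompt: "Summarize the attached log.",
  attachments: ["./incident/error.log"],
});
for (const call of calls) {
  console.log(call.tool, JSON.stringify(call.arguments));
}
```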
Real-world prompt example
Example agentify_query input:
{
"key": "incident-triage-prod-api",
"prompt": "You are my senior incident engineer. I attached a production error log and a screenshot from our monitoring dashboard.\n\nGoal: produce a high-confidence triage summary and a safe hotfix plan I can execute in 30 minutes.\n\nRequirements:\n1) Identify the most likely root cause with evidence from the log lines.\n2) List top 3 hypotheses and how to falsify each quickly.\n3) Give a step-by-step hotfix plan with exact commands.\n4) Include rollback steps and post-fix validation checks.\n5) Keep response concise and actionable.\n\nReturn format:\n- Root cause\n- Evidence\n- 30-minute hotfix plan\n- Rollback\n- Validation checklist",
"attachments": [
"./incident/error.log",
"./incident/dashboard.png"
],
"timeoutMs": 600000
}
What's new
- First-class multi-vendor tab support now includes Perplexity, Claude, Google AI Studio, Gemini, and Grok.
- Control Center reliability and UX were hardened (state/refresh wiring, tab actions, compact controls, clearer field guidance).
- Local API hardening includes strict invalid JSON handling, key/vendor mismatch protection, and safer tab-key recovery.
- Desktop runtime hardening includes Control Center sandboxing plus dependency security updates.
Governor (anti-spam)
Agentify Desktop includes a built-in governor to reduce accidental high-rate automation:
- Limits concurrent in-flight queries
- Limits queries per minute (token bucket)
- Enforces minimum gaps between queries (per tab + globally)
You can adjust these limits in the Control Center after acknowledging the disclaimer.
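To make the three limits concrete, here is a minimal sketch of how they can be combined in one admission check. It is an illustration under stated assumptions, not the app's implementation: the `Governor` class, its option names, its default values, and the order of the checks are all hypothetical.

```javascript
// Illustrative governor: concurrency cap, token bucket, and minimum gaps.
// All names and defaults here are assumptions, not Agentify Desktop's code.
class Governor {
  constructor({ maxInFlight = 2, perMinute = 6, minGapMs = 5000,
                minTabGapMs = 10000, now = Date.now } = {}) {
    this.maxInFlight = maxInFlight;
    this.capacity = perMinute;            // bucket size
    this.refillPerMs = perMinute / 60000; // tokens regained per millisecond
    this.minGapMs = minGapMs;
    this.minTabGapMs = minTabGapMs;
    this.now = now;                       // injectable clock for testing
    this.tokens = perMinute;
    this.lastRefill = now();
    this.inFlight = 0;
    this.lastGlobal = -Infinity;
    this.lastByTab = new Map();
  }

  // Returns { ok: true } and reserves a slot, or { ok: false, reason }.
  tryAcquire(tabKey) {
    const t = this.now();
    this.tokens = Math.min(this.capacity,
      this.tokens + (t - this.lastRefill) * this.refillPerMs);
    this.lastRefill = t;

    if (this.inFlight >= this.maxInFlight) return { ok: false, reason: "concurrency" };
    if (t - this.lastGlobal < this.minGapMs) return { ok: false, reason: "global-gap" };
    const lastTab = this.lastByTab.has(tabKey) ? this.lastByTab.get(tabKey) : -Infinity;
    if (t - lastTab < this.minTabGapMs) return { ok: false, reason: "tab-gap" };
    if (this.tokens < 1) return { ok: false, reason: "rate" };

    this.tokens -= 1;
    this.inFlight += 1;
    this.lastGlobal = t;
    this.lastByTab.set(tabKey, t);
    return { ok: true };
  }

  // Call when a query finishes, whether it succeeded or failed.
  release() {
    this.inFlight = Math.max(0, this.inFlight - 1);
  }
}

// Usage with a fake clock so the behavior is deterministic.
let clock = 0;
const gov = new Governor({ maxInFlight: 1, perMinute: 2, minGapMs: 1000,
                           minTabGapMs: 2000, now: () => clock });
const first = gov.tryAcquire("a");  // admitted
const second = gov.tryAcquire("b"); // refused: one query already in flight
gov.release();
const third = gov.tryAcquire("b");  // refused: too soon after the last query
clock = 1500;
const fourth = gov.tryAcquire("a"); // refused: tab "a" was used 1.5 s ago
const fifth = gov.tryAcquire("b");  // admitted
gov.release();
clock = 4000;
const sixth = gov.tryAcquire("c");  // refused: bucket has not refilled yet
console.log([first, second, third, fourth, fifth, sixth]);
```

Injecting the clock keeps the sketch testable without real waiting; a real governor would also need to queue or retry refused queries.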
Single-chat emulator (experimental)
Agentify Desktop can optionally run a local “orchestrator” that watches a ChatGPT thread for fenced JSON tool requests, runs Codex locally, and posts results back into the same ChatGPT thread. This gives you a “single-chat” orchestration feel without relying on ChatGPT’s built-in tools/MCP mode.
The orchestrator currently invokes Codex CLI directly. Core agentify_* MCP tools remain client-agnostic.
What it does
- Treats your ChatGPT Web thread as the “mothership” (planning + context).
- Watches for tool requests you paste as fenced JSON blocks.
- Runs Codex CLI locally in your workspace (interactive or non-interactive).
- Posts back: a short outcome + a bounded diff/review packet (so you’re not pasting 200k+ chars every time).
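One simple way to keep a posted packet bounded is to cap the diff at a character budget and keep its head and tail. The sketch below illustrates that idea only; `boundPacket`, the 20000-character default, and the marker text are hypothetical and not taken from the orchestrator.

```javascript
// Hypothetical helper: cap a diff at maxChars of original content, keeping the
// beginning and the end and noting how much was dropped in between.
function boundPacket(diff, maxChars = 20000) {
  if (diff.length <= maxChars) {
    return { text: diff, truncated: false, omitted: 0 };
  }
  const head = Math.ceil(maxChars / 2);
  const tail = maxChars - head;
  const omitted = diff.length - maxChars;
  const marker = `\n[... ${omitted} chars omitted ...]\n`;
  return {
    // The marker is added on top of the maxChars budget.
    text: diff.slice(0, head) + marker + diff.slice(diff.length - tail),
    truncated: true,
    omitted,
  };
}

// Usage with a tiny budget so the effect is visible.
const packet = boundPacket("abcdefghij", 4);
console.log(packet.text);
```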
Quick test (recommended)
- Start the app and sign in:
  - Run ./scripts/quickstart.sh --show-tabs
  - In the Control Center, click Show default and sign in to https://chatgpt.com
- Start an orchestrator session:
  - In the Control Center → Orchestrator, start an orchestrator for a project key (one key per project/workstream).
- In the ChatGPT thread (same tab/key), paste a fenced JSON request like:
{
"tool": "codex.run",
"mode": "interactive",
"args": {
"prompt": "Find the README file and add a short troubleshooting section. Then run tests."
}
}
- Wait for the orchestrator to post results back into the thread.
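To show what "watching for fenced JSON tool requests" can look like, here is a minimal sketch that pulls such requests out of thread text. It assumes a request is any fenced block that parses as a JSON object with a string `tool` field, as in the example above; `extractToolRequests` is a hypothetical helper and the orchestrator's real parsing rules may differ.

```javascript
// Build the fence marker at runtime so this example can itself sit in a fence.
const FENCE = "`".repeat(3);
const BLOCK_RE = new RegExp(
  FENCE + "(?:json)?[ \\t]*\\r?\\n([\\s\\S]*?)\\r?\\n[ \\t]*" + FENCE,
  "g"
);

// Hypothetical helper: return every fenced block that parses as a JSON object
// with a string "tool" field. Anything else is ignored.
function extractToolRequests(threadText) {
  const requests = [];
  for (const match of threadText.matchAll(BLOCK_RE)) {
    try {
      const parsed = JSON.parse(match[1]);
      if (parsed && typeof parsed === "object" && typeof parsed.tool === "string") {
        requests.push(parsed);
      }
    } catch {
      // Not valid JSON: skip this block.
    }
  }
  return requests;
}

// Usage: one valid request and one fenced block that is not JSON.
const message = [
  "Please run this:",
  FENCE + "json",
  '{ "tool": "codex.run", "mode": "interactive", "args": { "prompt": "Run tests." } }',
  FENCE,
  "And ignore this one:",
  FENCE,
  "not json",
  FENCE,
].join("\n");
const found = extractToolRequests(message);
console.log(found);
```

Skipping blocks that fail to parse, instead of throwing, keeps ordinary code snippets in the thread from being treated as requests.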
Tips
- Use one stable key per project so parallel jobs don’t mix.
- If the orchestrator can’t find the right workspace root, set it in the Control Center (Workspace/Allowlist), then retry.
- If you want the orchestrator to post less frequently, keep prompts focused (it posts progress updates on a timer).
Limitations / robustness notes
- File upload selectors: input[type=file] selection is best-effort; if ChatGPT changes the upload flow, update selectors.json or ~/.agentify-desktop/selectors.override.json.
- Perplexity selectors: Perplexity support is best-effort and may require selector overrides in ~/.agentify-desktop/selectors.override.json if UI changes.
- Gemini selectors: Gemini support is best-effort and may require selector overrides in ~/.agentify-desktop/selectors.override.json if UI changes.
- Completion detection: waiting for “stop generating” to disappear + text stability works well, but can mis-detect on very long outputs or intermittent streaming pauses.
- Image downloads: prefers <img> elements in the latest assistant message; some UI modes may render images via nonstandard elements.
- Parallelism model: “tabs” are separate windows; they can run in parallel without stealing focus unless a human check is required.
- Security knobs: default is loopback-only + bearer token; token rotation and shutdown are supported via MCP tools.
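The completion-detection note above combines two signals: the stop control is gone, and the text has stopped changing. A minimal sketch of that rule over a series of page snapshots follows. The sample shape, `isResponseComplete`, and the 3000 ms default are assumptions for illustration, not the app's actual detector.

```javascript
// Hypothetical helper. Each sample is { t, text, stopVisible }, oldest first:
// t is a timestamp in ms, and stopVisible says whether a "stop generating"
// control was on screen. stableMs is an assumed tuning knob.
function isResponseComplete(samples, stableMs = 3000) {
  if (samples.length === 0) return false;
  const last = samples[samples.length - 1];
  if (last.stopVisible || last.text.length === 0) return false;

  // Walk back to the earliest sample that already showed the final text
  // with the stop control gone.
  let stableSince = last.t;
  for (let i = samples.length - 2; i >= 0; i--) {
    const s = samples[i];
    if (s.stopVisible || s.text !== last.text) break;
    stableSince = s.t;
  }
  return last.t - stableSince >= stableMs;
}

// Usage: the text settles at t=2000 and is still unchanged at t=5000.
const samples = [
  { t: 0, text: "He", stopVisible: true },
  { t: 1000, text: "Hello", stopVisible: true },
  { t: 2000, text: "Hello", stopVisible: false },
  { t: 5000, text: "Hello", stopVisible: false },
];
console.log(isResponseComplete(samples.slice(0, 3)), isResponseComplete(samples));
```

A larger `stableMs` reduces false positives during streaming pauses at the cost of slower detection, which is the trade-off the note describes.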
Login troubleshooting (Google SSO)
- Symptom: login shows “This browser or app may not be secure” or the flow never completes.
- Check 1: In Control Center, enable Allow auth popups (needed for Google/Microsoft/Apple SSO).
- Check 2: Retry login from a fresh ChatGPT tab (Create tab → ChatGPT → Show).
- Check 3: If your provider asks for WebAuthn/security key prompts, complete/cancel once and continue; some providers require that step before password/passkey fallback.
Build installers (unsigned)
npm run dist
Artifacts land in dist/.
Security and data
- Control API binds to 127.0.0.1 on an ephemeral port by default.
- Auth uses a local bearer token stored under ~/.agentify-desktop/.
- Electron session data (cookies/local storage) is stored under ~/.agentify-desktop/electron-user-data/.
See SECURITY.md.
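As an illustration of the bearer-token check, the sketch below validates a header value against an expected token. It assumes the conventional `Authorization: Bearer <token>` format, which this README does not spell out; `isAuthorized` is a hypothetical helper, and SECURITY.md remains the authoritative description.

```javascript
// Hypothetical helper: check an "Authorization: Bearer <token>" header value
// against the expected local token. The header format is an assumption.
function isAuthorized(authorizationHeader, expectedToken) {
  if (typeof authorizationHeader !== "string" || !expectedToken) return false;
  const match = /^Bearer (\S+)$/.exec(authorizationHeader);
  if (!match) return false;
  const presented = match[1];

  // Accumulate differences instead of returning at the first mismatch.
  let diff = presented.length ^ expectedToken.length;
  for (let i = 0; i < presented.length; i++) {
    diff |= presented.charCodeAt(i) ^ expectedToken.charCodeAt(i % expectedToken.length);
  }
  return diff === 0;
}

// Usage: the token value here is a placeholder, not a real token format.
console.log(isAuthorized("Bearer example-token", "example-token"));
console.log(isAuthorized("Bearer wrong-token", "example-token"));
```

In real Node.js code, `crypto.timingSafeEqual` on equal-length buffers is the usual choice for the comparison; the manual loop only keeps this sketch free of imports.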
Trademarks
Forks/derivatives may not use Agentify branding. See TRADEMARKS.md.
