@charzhu/openjaw-agent
v0.3.2
OpenJaw Agent — Autonomous desktop AI assistant for the terminal. Rich Ink TUI, 100+ tools, multi-channel bridges (Telegram, Feishu, Teams, WeChat). Standalone, no MCP server required.
What is OpenJaw Agent?
OpenJaw Agent is a standalone AI assistant that runs in your Windows terminal and can automate your entire desktop:
- 📧 Email — Read, compose, reply, forward, search (Outlook via COM)
- 💬 Teams — Send messages, read chats, monitor conversations (UIA automation)
- 💬 WeChat — Send messages, read chats via official iLink Bot API
- 🌐 Browser — Navigate, click, type, screenshot, extract content (Chrome CDP)
- 📄 Office — Word, Excel, PowerPoint automation (COM)
- 📁 Files — Read, write, edit, search, list, delete
- 🖥️ System — Run commands, clipboard, notifications, web search
- 🧠 Memory — Persistent hybrid-search memory, shared with MCP mode
- 🔌 MCP — Auto-discovers and connects to external MCP servers
- 🗣️ Voice — Text-to-speech (edge-tts) and speech-to-text input
- 🎓 Skills — 19 bundled skills for email drafting, research, document creation, and more
It connects to Claude or GPT via your proxy or API key, reasons about your request, picks the right tools, and executes them autonomously.
Multi-Channel Access
Beyond the terminal, the agent can be accessed through messaging bridges — all running the same agent loop with full tool access:
| Channel | Flag | How It Works |
|---------|------|-------------|
| Terminal | (default) | New TUI built on React + vendored @openjaw/ink, with streaming, status bars, tool progress, and model/session controls |
| Telegram | --telegram | Long-polling bot — text, voice, photos. No public IP needed |
| Telegram headless | --telegram --headless | Telegram-only, no terminal UI |
| Teams | --teams | Self-chat bridge via Graph API. No bot registration needed |
| Feishu (Lark) | --feishu | WebSocket events via official SDK |
| WeChat | --wechat | iLink Bot API — QR login from iOS WeChat |
| Legacy Ink UI | --legacy-ui | Previous Ink UI (pre-rewrite) for A/B comparison |
| Legacy REPL | --legacy | Simple readline REPL |
Bridges run alongside the terminal UI (hybrid mode) or standalone (headless).
Terminal UI Architecture
The default terminal experience is the new TUI rewrite. src/bootstrap.ts initializes the in-process AgentLoop, tools, MCP, memory, voice, and bridge services; src/eventBridge.ts translates agent chunks into typed GatewayEvent objects; and src/agentBus.ts exposes a hermes-compatible GatewayClient event/RPC bus. src/rpcHandlers.ts serves the ported hermes hooks (useMainApp, useSessionLifecycle, useSubmission, and related hooks), while nanostores under src/app/* hold UI state for turns, overlays, selections, and delegation. React components in src/components/* render through the vendored @openjaw/ink renderer, a fork of hermes ink; see packages/openjaw-ink/VENDOR.md for provenance.
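The chunk-to-event translation described above can be pictured with a small sketch. The `AgentChunk` and `GatewayEvent` shapes and the event names below are hypothetical stand-ins for illustration; the real unions live in src/agentEvents.ts and may differ.

```typescript
// Hypothetical chunk and event shapes, not the project's actual types.
type AgentChunk =
  | { kind: "text"; text: string }
  | { kind: "tool_start"; tool: string }
  | { kind: "done" };

type GatewayEvent =
  | { type: "message.delta"; payload: { text: string } }
  | { type: "tool.start"; payload: { name: string } }
  | { type: "message.complete"; payload: Record<string, never> };

// Translate one agent chunk into one typed UI event.
function toGatewayEvent(chunk: AgentChunk): GatewayEvent {
  switch (chunk.kind) {
    case "text":
      return { type: "message.delta", payload: { text: chunk.text } };
    case "tool_start":
      return { type: "tool.start", payload: { name: chunk.tool } };
    case "done":
      return { type: "message.complete", payload: {} };
  }
}

const chunks: AgentChunk[] = [{ kind: "text", text: "Hi" }, { kind: "done" }];
const events = chunks.map(toGatewayEvent);
console.log(events.map((e) => e.type).join(","));
```

Keeping the translation in one pure function is one way to let the UI hooks depend only on event types rather than on agent-loop internals.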
Use the default command for the new TUI, --legacy-ui to A/B test against the previous Ink UI, or --legacy for the readline REPL. See docs/TUI.md for a concise map of the UI rewrite.
Quick Start
Prerequisites
- Node.js 22.5+: uses the built-in node:sqlite module. Node 23.5+ recommended (its bundled SQLite includes FTS5 for faster memory search; older builds fall back to a LIKE-based scan).
- Windows 10/11 recommended (some automation tools require Windows; cross-platform basics work elsewhere)
- LLM access: Maestro proxy, Anthropic/OpenAI API keys, or GitHub Copilot via /connect
Install from npm (recommended)
# Global install — exposes the `openjaw-agent` command on PATH
npm install -g @charzhu/openjaw-agent
openjaw-agent
# Or run on demand without installing
npx @charzhu/openjaw-agent
The package ships a single bundled dist/main.js (no postinstall build step). After install, run /connect to pick a provider, then /model to pick a model.
Install from source (for contributors)
git clone https://github.com/charzhu/openjaw.git
cd openjaw/openjaw-agent
# One-click install (builds openjaw core + agent, configures proxy)
install.bat
# Or manually:
cd ../openjaw-mcp && npm install && npm run build # build core first
cd ../openjaw-agent && npm install && npm run build
Configure
For normal interactive use, you do not need to edit provider keys in YAML. Launch OpenJaw Agent, run /connect to pick Maestro, Anthropic, OpenAI, or GitHub Copilot, then run /model to choose from connected providers. /connect stores direct provider credentials in ~/.openjaw-agent/auth.json and each provider context starts with a sensible default model.
config.yaml in the package directory, or ~/.openjaw-agent/config.yaml after first run, is mainly for advanced defaults and non-interactive deployments:
llm:
  provider: anthropic # updated by /connect
  model: claude-sonnet-4-20250514 # updated by /model
  api_key: proxy-token # placeholder; direct keys are stored by /connect
  base_url: http://localhost:23333/api/anthropic # default Maestro proxy URL
  max_tokens: 16384
  temperature: 0.7
Advanced OpenAI-compatible proxy tuning is still available with openai_tool_mode: auto | compact | full and openai_max_tools, but it is not part of the normal setup flow.
llm:
  provider: openai
  model: gpt-5.4
  api_key: proxy-token # actual key is read from ~/.openjaw-agent/auth.json
  max_tokens: 16384
  temperature: 0.7
For local proxy mode through Agent Maestro (http://localhost:23333/api/anthropic or http://localhost:23333/api/openai), use /connect maestro, then /model to choose a Maestro model. Maestro model selections use the local proxy and do not use Anthropic/OpenAI API keys stored by /connect. For OpenAI-compatible proxy requests, the default openai_tool_mode: auto sends a compact subset of tools instead of the full registry. All tools remain executable locally; the model can request additional tools with openjaw_load_tools. Use openai_tool_mode: full only for endpoints that reliably handle the full tool inventory.
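As a rough illustration of the tool-mode behavior described above, the sketch below picks which tools to advertise to the model. The `core` flag and the "core tools first, capped at the maximum" rule are assumptions made for this example; the README does not document the actual selection heuristic.

```typescript
type ToolMode = "auto" | "compact" | "full";

// Hypothetical tool descriptor; `core` marks tools assumed to belong to the compact set.
interface Tool {
  name: string;
  core: boolean;
}

// Assumption: "auto" behaves like "compact" for OpenAI-compatible proxies,
// and the compact set is the core tools capped at maxTools.
function toolsToAdvertise(all: Tool[], mode: ToolMode, maxTools: number): Tool[] {
  if (mode === "full") return all;
  return all.filter((t) => t.core).slice(0, maxTools);
}

const registry: Tool[] = [
  { name: "read_file", core: true },
  { name: "openjaw_load_tools", core: true },
  { name: "excel_set_formula", core: false },
];
console.log(toolsToAdvertise(registry, "auto", 2).map((t) => t.name));
```

Tools left out of the advertised list stay registered locally, which matches the note that the model can ask for more through openjaw_load_tools.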
llm:
  provider: anthropic
  model: claude-sonnet-4-20250514
  api_key: proxy-token # actual key is read from ~/.openjaw-agent/auth.json
  max_tokens: 16384
  temperature: 0.7
telegram:
  token: "YOUR_BOT_TOKEN" # Get from @BotFather
  allowed_users: [YOUR_USER_ID] # Get from @userinfobot
feishu:
  app_id: "YOUR_APP_ID"
  app_secret: "YOUR_APP_SECRET"
  allowed_users: ["open_id_1"] # Optional allowlist
Run
# Launch with the new TUI (default — React + @openjaw/ink)
openjawagent.bat
node dist/main.js
# A/B test against the previous Ink UI (pre-rewrite)
node dist/main.js --legacy-ui
# Launch with messaging bridges
node dist/main.js --telegram # New TUI + Telegram
node dist/main.js --telegram --headless # Telegram only
node dist/main.js --teams # New TUI + Teams self-chat
node dist/main.js --feishu # New TUI + Feishu
node dist/main.js --wechat # New TUI + WeChat
# Launch with legacy readline REPL
node dist/main.js --legacy
Usage
Chat Naturally
Just type your request:
❯ read my latest emails and summarize them
❯ send a teams message to PM Team Meeting saying "heading into meeting"
❯ what did Chenlu say in Teams this week?
❯ open the report in Excel and add a SUM formula in B10
The agent reasons about your request, picks the right tools, executes them, and shows results with real-time feedback.
Commands
| Command | Description |
|---------|-------------|
| /help | Show all commands |
| /model | Switch LLM provider/model (interactive picker) |
| /connect | Connect/disconnect/list LLM provider contexts |
| /schedule "prompt" every 15m | Schedule recurring tasks |
| /schedule list | Show active scheduled tasks |
| /schedule stop <id> | Cancel a scheduled task |
| /workflow <goal> | Start an advisory dynamic workflow with read-only workers |
| /workflow status [id] | Open navigable worker status and detail view |
| /tools | List all available tools |
| /mcp | Manage MCP server connections |
| /memory | Search persistent memory |
| /voice | Toggle voice input/output |
| /clear | Clear conversation history (new session) |
| /exit | Quit the agent |
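The `/schedule "prompt" every 15m` form in the table above could be parsed along these lines. This is a minimal sketch: the `parseSchedule` helper is hypothetical, and it assumes only `s`, `m`, and `h` suffixes, which may not match what src/scheduler.ts accepts.

```typescript
interface ScheduleSpec {
  prompt: string;
  intervalMs: number;
}

// Milliseconds per supported unit suffix (assumed set: seconds, minutes, hours).
const UNIT_MS: Record<string, number> = { s: 1000, m: 60000, h: 3600000 };

// Parse the text that follows "/schedule", e.g. `"check inbox" every 15m`.
function parseSchedule(args: string): ScheduleSpec | null {
  const match = /^"(.+)"\s+every\s+(\d+)([smh])$/.exec(args.trim());
  if (!match) return null;
  const prompt = match[1];
  const amount = Number(match[2]);
  const unit = match[3];
  return { prompt, intervalMs: amount * UNIT_MS[unit] };
}

const spec = parseSchedule('"check my inbox for urgent emails" every 15m');
console.log(spec);
```

Returning `null` on malformed input lets the command handler print a usage hint instead of scheduling a task with a bad interval.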
Switch Models at Runtime
❯ /model
Current: anthropic/claude-sonnet-4-20250514
? Select model (↑↓ to navigate, Enter to select):
❯ anthropic/claude-opus-4.6
anthropic/claude-sonnet-4.6
anthropic/claude-sonnet-4-20250514
anthropic/claude-haiku-4.5
Maestro/gpt-5.4
Maestro/gpt-4.1
...
/model shows models from connected provider contexts only. If Maestro is connected, models are grouped under Maestro but still select the underlying anthropic or openai provider so the correct local proxy endpoint is used. Selection is saved to config and persists across restarts.
Connect Providers
Use /connect after launch to pick the active provider context. Direct provider credentials are stored locally under ~/.openjaw-agent/auth.json; secrets are not written to the repo config or shown in status output. Selecting a provider also applies its default model, and /model can change it afterward.
For direct Anthropic/OpenAI usage, run:
❯ /connect anthropic <api-key>
❯ /connect openai <api-key>
The command stores the API key locally and switches the active provider context. Local Maestro proxy mode does not require these keys:
❯ /connect maestro
OpenJaw Agent also supports GitHub Copilot as a first-class provider. The bundled Copilot client ID currently uses opencode's OAuth app because testing showed it exposes the full Copilot model list. You can override it with your organization's OAuth App client ID if needed.
llm:
  provider: github-copilot
  model: gpt-5.4
  api_key: proxy-token
  copilot_oauth_client_id: Iv1.b507a08c87ecfe98
You can also override it with GITHUB_COPILOT_CLIENT_ID in the environment.
❯ /connect github-copilot
The command prints a GitHub device-login URL and one-time code, waits for authorization, then stores the credential. For GitHub Enterprise:
❯ /connect github-copilot enterprise company.ghe.com
Then switch models:
❯ /model github-copilot gpt-5.4
Useful credential commands:
| Command | Description |
|---------|-------------|
| /connect | Show connected provider contexts and switch options |
| /connect maestro | Use local Maestro proxy with no API key |
| /connect anthropic <api-key> | Store an Anthropic API key and select Anthropic |
| /connect openai <api-key> | Store an OpenAI API key and select OpenAI |
| /connect list | List connected contexts and stored credentials without showing secrets |
| /connect status | Show credential store path and provider status |
| /connect disconnect | Pick a connected provider, including Maestro, to disconnect |
Copilot model discovery uses GitHub Copilot's /models endpoint when connected and falls back to a static catalog when offline. GPT models use Copilot's OpenAI-compatible endpoints; Copilot Claude models exposed through /v1/messages use the Anthropic-compatible shim.
Schedule Recurring Tasks
❯ /schedule "check my inbox for urgent emails" every 15m
✓ Scheduled task #1: runs every 15 minutes
❯ /schedule list
#1 every 15m runs: 3 last: 7:35 AM
"check my inbox for urgent emails"
❯ /schedule stop 1
✓ Task #1 stopped
Sessions (Resume Conversations)
Sessions auto-save after each turn. Resume anytime:
# List recent sessions
node dist/main.js --sessions
# Resume the most recent session
node dist/main.js --continue # or -c
# Resume a specific session
node dist/main.js --resume a1b2c3d4
Skills
19 bundled skills provide multi-step workflows that the agent can invoke:
| Skill | Description |
|-------|-------------|
| daily-briefing | Morning briefing from emails, calendar, Teams |
| deep-research | Multi-source web research with synthesis |
| email-drafting | Compose polished emails |
| email-with-attachment | Send emails with file attachments |
| meeting-summarizer | Summarize meeting transcripts |
| create-docx | Create Word documents |
| create-pptx | Create PowerPoint presentations |
| create-pdf-report | Generate PDF reports |
| data-analysis | Analyze datasets and create charts |
| web-research | Quick web lookups |
| translation | Multi-language translation |
| proofreading | Grammar and style review |
| summarization | Summarize long documents |
| desktop-cleanup | Organize files and desktop |
| doc-coauthoring | Co-author documents with edit tracking |
| internal-comms | Draft internal communications |
| competitive-battlecard | Competitive analysis documents |
| skill-creator | Create new custom skills |
| refresh-token | Refresh auth tokens for bridges |
Custom skills can be added to ~/.openjaw-agent/skills/ as Markdown files. User skills override bundled skills of the same name.
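The override rule above (a user skill replaces a bundled skill with the same name) can be sketched as a two-pass merge. The `Skill` shape and `resolveSkills` helper are hypothetical; the actual discovery logic lives in src/skills/registry.ts and is not shown here.

```typescript
// Hypothetical skill record for illustration.
interface Skill {
  name: string;
  source: "bundled" | "user";
  body: string;
}

// Merge bundled and user skills; entries registered later replace earlier ones.
function resolveSkills(bundled: Skill[], user: Skill[]): Map<string, Skill> {
  const registry = new Map<string, Skill>();
  for (const skill of bundled) registry.set(skill.name, skill);
  for (const skill of user) registry.set(skill.name, skill); // user copy wins
  return registry;
}

const resolved = resolveSkills(
  [
    { name: "translation", source: "bundled", body: "default steps" },
    { name: "proofreading", source: "bundled", body: "default steps" },
  ],
  [{ name: "translation", source: "user", body: "my custom steps" }],
);
console.log(resolved.get("translation")?.source);
```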
MCP Integration
OpenJaw Agent auto-discovers and connects to external MCP servers from 5 config sources (in priority order):
1. ~/.openjaw-agent/mcp.json: Agent's own config (always trusted)
2. .mcp.json + parent directories: Claude Code project config
3. ~/.copilot/mcp-config.json: GitHub Copilot CLI config
4. ~/.copilot/installed-plugins/: Copilot marketplace plugins (WorkIQ, etc.)
5. .vscode/mcp.json: VS Code config (normalized to standard format)
Supports stdio, SSE, and Streamable HTTP transports. Tools from external servers appear as mcp__<server>__<tool> and can be used alongside built-in tools.
Use /mcp to interactively manage connections — toggle servers, view tools, reconnect.
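To make the mcp__<server>__<tool> naming concrete, here is a small sketch that builds and splits such names. Both helpers are hypothetical, and the parser assumes server names never contain a double underscore, which the README does not state.

```typescript
const PREFIX = "mcp__";

// Build the qualified name under which an external tool is exposed.
function qualifiedToolName(server: string, tool: string): string {
  return `${PREFIX}${server}__${tool}`;
}

// Split a qualified name back into its parts.
// Assumption: the first "__" after the prefix separates server from tool.
function parseQualifiedToolName(name: string): { server: string; tool: string } | null {
  if (!name.startsWith(PREFIX)) return null; // built-in tool, not an MCP tool
  const rest = name.slice(PREFIX.length);
  const sep = rest.indexOf("__");
  if (sep <= 0 || sep + 2 >= rest.length) return null;
  return { server: rest.slice(0, sep), tool: rest.slice(sep + 2) };
}

console.log(qualifiedToolName("github", "create_issue"));
console.log(parseQualifiedToolName("mcp__github__create_issue"));
```

The prefix keeps external tools from colliding with built-in tool names and lets the agent route a call back to the right server.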
Dynamic Workflows
Use /workflow <goal> for complex advisory work that benefits from parallel read-only workers. OpenJaw asks the active model to return a JSON workflow graph, dynamically plans worker count from the goal, schedules workers through an adaptive queue, and shows progress in the /workflow status [id] overlay. Use arrow keys or j/k to select workers, Enter/→ for details, Esc/← to return, s to sort, f to filter, and q to close.
Workflows include a final synthesizer worker. When the workflow completes, the synthesizer's answer is posted back into the transcript and is also visible through /workflow show [id] and the worker details view.
Workflow workers are intentionally non-mutating in this version: they can inspect files and search, but cannot edit files, run shell/code execution, send messages, or update memory. Results are persisted under ~/.openjaw-agent/workflows/ and replayable through the workflow status and spawn-tree archive paths.
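One way to picture the non-mutating restriction is a filter that hands workers only read-type tools. The `effect` classification and the tool names below are assumptions for illustration, not the project's actual tool metadata.

```typescript
// Hypothetical side-effect classes for tools.
type ToolEffect = "read" | "write" | "execute" | "send" | "memory";

interface ToolSpec {
  name: string;
  effect: ToolEffect;
}

// Workflow workers receive only tools that inspect state without changing it.
function workerTools(all: ToolSpec[]): ToolSpec[] {
  return all.filter((t) => t.effect === "read");
}

const available: ToolSpec[] = [
  { name: "file_read", effect: "read" },
  { name: "file_search", effect: "read" },
  { name: "file_edit", effect: "write" },
  { name: "shell_run", effect: "execute" },
  { name: "teams_send", effect: "send" },
  { name: "memory_update", effect: "memory" },
];
console.log(workerTools(available).map((t) => t.name));
```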
Architecture
openjaw-agent/
├── src/
│ ├── main.ts # CLI dispatch: new TUI default, --legacy-ui, --legacy
│ ├── entry.tsx # New TUI entry; renders React with @openjaw/ink
│ ├── bootstrap.ts # Initializes AgentLoop, tools, MCP, memory, voice, bridges
│ ├── agentBus.ts # In-process GatewayClient event/RPC bus
│ ├── agentEvents.ts # GatewayEvent type union for the ported UI
│ ├── eventBridge.ts # Converts AgentLoop chunks into GatewayEvents
│ ├── rpcHandlers.ts # Hermes-style RPC handlers for UI hooks
│ ├── app.tsx # New TUI root component
│ ├── app/ # Ported hermes hooks + nanostores UI state
│ │ ├── useMainApp.ts
│ │ ├── useSessionLifecycle.ts
│ │ ├── useSubmission.ts
│ │ └── uiStore.ts # Plus overlay/turn/delegation/input stores
│ ├── components/ # New TUI React components
│ │ ├── appChrome.tsx
│ │ ├── streamingAssistant.tsx
│ │ ├── sessionPicker.tsx
│ │ ├── modelPicker.tsx
│ │ └── todoPanel.tsx
│ ├── legacy-ink-ui.tsx # Previous Ink UI (--legacy-ui)
│ ├── agent-loop.ts # ReAct orchestrator (parallel tools, usage tracking)
│ ├── mcp-client.ts # MCP client — auto-discovers external servers
│ ├── bridges/
│ │ ├── telegram.ts # Telegram bot bridge (long-polling)
│ │ ├── teams.ts # Teams self-chat bridge (Graph API)
│ │ ├── feishu.ts # Feishu/Lark bridge (WebSocket)
│ │ ├── wechat.ts # WeChat iLink bridge (HTTP polling)
│ │ └── format.ts # Platform-specific message formatting
│ ├── voice/
│ │ ├── index.ts # Voice manager (enable/disable, settings)
│ │ ├── tts.ts # Text-to-speech (edge-tts)
│ │ └── stt.ts # Speech-to-text (Windows Speech Recognition)
│ ├── prompts/
│ │ ├── index.ts # Structured prompt assembly (static/dynamic boundary)
│ │ ├── sections.ts # Memoization framework + SYSTEM_PROMPT_DYNAMIC_BOUNDARY
│ │ ├── identity.ts # Static: agent persona
│ │ ├── reasoning.ts # Static: ReAct rules
│ │ ├── safety.ts # Static: safety guardrails
│ │ ├── computerUse.ts # Static: computer use guidelines
│ │ ├── user.ts # Dynamic: user preferences
│ │ ├── memory.ts # Dynamic: persistent memory
│ │ └── context.ts # Dynamic: runtime context
│ ├── providers/
│ │ ├── types.ts # LLMProvider + streaming + usage types
│ │ ├── anthropic.ts # Anthropic (cache_control, streaming, usage)
│ │ ├── openai.ts # OpenAI (Responses + Completions API, usage)
│ │ ├── cache-control.ts # System prompt → cache-scoped blocks
│ │ └── index.ts # Provider factory
│ ├── skills/
│ │ └── registry.ts # Skill discovery (bundled + user overrides)
│ ├── tools/
│ │ └── skill-tool.ts # LLM-callable skill invocation tool
│ ├── utils/
│ │ └── frontmatter.ts # Markdown frontmatter parser for skills
│ ├── cost-tracker.ts # Per-model pricing + session cost tracking
│ ├── context-manager.ts # Token estimation + context window warnings
│ ├── cache-monitor.ts # Prompt cache break detection between turns
│ ├── telemetry.ts # JSONL event logging (~/.openjaw-agent/telemetry/)
│ ├── config.ts # Config loader (~/.openjaw-agent/config.yaml)
│ ├── session.ts # Session persistence + resume
│ ├── scheduler.ts # Recurring task scheduler
│ ├── fork.ts # Background sub-agent spawning
│ ├── pet.ts # Pet companion system (Chinese mythical creatures)
│ ├── computer-use.ts # Anthropic computer use executor
│ ├── clipboard-image.ts # Windows clipboard image reader
│ ├── image-resize.ts # Image processing for vision APIs
│ └── repl.ts # Legacy readline REPL (--legacy mode)
├── prompts/ # Markdown prompt files
│ ├── IDENTITY.md
│ ├── REASONING.md
│ ├── SAFETY.md
│ ├── COMPUTER_USE.md
│ └── USER.md
├── skills/ # Bundled skill definitions (19 skills)
├── docs/
│ └── TUI.md # Concise TUI rewrite overview
├── packages/
│ └── openjaw-ink/ # Vendored @openjaw/ink renderer forked from @hermes/ink
├── config.yaml # Bundled default config
├── install.bat # One-click Windows installer
├── openjawagent.bat # Launch script
├── package.json
└── tsconfig.json
Key Design Decisions
TUI Rewrite — The default UI is an in-process React TUI rendered by @openjaw/ink. GatewayEvent streaming and RPC handlers isolate AgentLoop/service logic from ported hermes hooks, while nanostores keep UI state small and composable. Use --legacy-ui for the previous Ink UI when comparing behavior.
ReAct Loop — The agent loop implements a Reasoning + Acting pattern: the LLM reasons about the request, selects tools, the loop executes them in parallel, feeds results back, and repeats until done.
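The loop described above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the `Model` and `ToolRunner` types, the string-based history, and the turn limit are all hypothetical, and src/agent-loop.ts additionally handles streaming and usage tracking.

```typescript
interface ToolCall {
  name: string;
  input: string;
}
interface ModelTurn {
  text: string;
  toolCalls: ToolCall[];
}
type Model = (history: string[]) => Promise<ModelTurn>;
type ToolRunner = (call: ToolCall) => Promise<string>;

async function reactLoop(
  model: Model,
  runTool: ToolRunner,
  request: string,
  maxTurns = 8, // hypothetical safety limit
): Promise<string> {
  const history = [`user: ${request}`];
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = await model(history); // reason
    history.push(`assistant: ${reply.text}`);
    if (reply.toolCalls.length === 0) return reply.text; // nothing left to do
    // Act: run every requested tool concurrently, then feed all results back.
    const results = await Promise.all(reply.toolCalls.map((c) => runTool(c)));
    results.forEach((result, i) => {
      history.push(`tool(${reply.toolCalls[i].name}): ${result}`);
    });
  }
  return "stopped: turn limit reached";
}
```

`Promise.all` preserves the order of the tool calls, so each result can be matched to its call by index even though the tools finish in any order.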
Prompt Caching — The system prompt is split into static and dynamic sections separated by a boundary marker. Static sections (identity, reasoning, safety, computer use) get Anthropic's cache_control: { type: 'ephemeral' } for prompt caching. Dynamic sections (user profile, memory, context) are memoized per session. Cache breaks are tracked by CacheMonitor to surface cost spikes.
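A minimal sketch of the static/dynamic split follows. The marker string is a placeholder (the real SYSTEM_PROMPT_DYNAMIC_BOUNDARY value is defined in src/prompts/sections.ts and not shown in this README), and `toSystemBlocks` is a hypothetical helper; only the `cache_control: { type: "ephemeral" }` field is taken from the text above.

```typescript
// Placeholder marker, assumed for this example only.
const BOUNDARY = "<<DYNAMIC_BOUNDARY>>";

interface SystemBlock {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
}

// Split the assembled prompt at the boundary and mark only the static part cacheable.
function toSystemBlocks(prompt: string): SystemBlock[] {
  const at = prompt.indexOf(BOUNDARY);
  if (at === -1) return [{ type: "text", text: prompt }];
  const staticPart = prompt.slice(0, at).trim();
  const dynamicPart = prompt.slice(at + BOUNDARY.length).trim();
  const blocks: SystemBlock[] = [
    { type: "text", text: staticPart, cache_control: { type: "ephemeral" } },
  ];
  if (dynamicPart) blocks.push({ type: "text", text: dynamicPart });
  return blocks;
}

const blocks = toSystemBlocks(`You are OpenJaw.\n${BOUNDARY}\nUser prefers short replies.`);
console.log(blocks);
```

Because the cached prefix must be byte-identical between turns, anything that changes per turn belongs after the boundary.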
Tool Permission Model — Sensitive tools (shell commands, file deletion, file writes) trigger an interactive permission dialog. Users can approve once or allow-all for the session.
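The approve-once versus allow-all behavior might be modeled as below. `PermissionGate` and the `ask` callback are hypothetical, and the sketch assumes that allow-all covers every sensitive tool for the rest of the session rather than only the tool that prompted.

```typescript
type Decision = "allow_once" | "allow_all" | "deny";

class PermissionGate {
  private allowAll = false;
  private readonly sensitive: Set<string>;

  constructor(sensitiveTools: string[]) {
    this.sensitive = new Set(sensitiveTools);
  }

  // `ask` stands in for the interactive permission dialog.
  check(tool: string, ask: (tool: string) => Decision): boolean {
    if (!this.sensitive.has(tool) || this.allowAll) return true;
    const decision = ask(tool);
    if (decision === "allow_all") this.allowAll = true; // remembered for the session
    return decision !== "deny";
  }
}

const gate = new PermissionGate(["shell_run", "file_delete", "file_write"]);
console.log(gate.check("file_read", () => "deny")); // non-sensitive tools never prompt
```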
Fork System — The agent can spawn background sub-agents for parallel work. Forks share the parent's system prompt and tools but run with their own conversation history.
Telemetry — Every agent turn is logged as structured JSONL to ~/.openjaw-agent/telemetry/ with token counts, cost, duration, and cache metrics. Files rotate daily.
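For readers unfamiliar with JSONL, the sketch below shows one record per line and a date-based file name for daily rotation. The field names and the `YYYY-MM-DD.jsonl` naming are assumptions; the README lists the kinds of metrics recorded but not the actual schema or file names.

```typescript
// Hypothetical per-turn record.
interface TurnRecord {
  timestamp: string; // ISO 8601, UTC
  model: string;
  inputTokens: number;
  outputTokens: number;
  costUsd: number;
  durationMs: number;
  cacheReadTokens: number;
}

// JSONL: exactly one JSON object per line.
function toJsonl(record: TurnRecord): string {
  return JSON.stringify(record) + "\n";
}

// Daily rotation: derive the file name from the record's UTC date.
function telemetryFileName(record: TurnRecord): string {
  return `${record.timestamp.slice(0, 10)}.jsonl`;
}

const record: TurnRecord = {
  timestamp: "2025-01-15T07:35:00.000Z",
  model: "claude-sonnet-4-20250514",
  inputTokens: 1200,
  outputTokens: 300,
  costUsd: 0.0081,
  durationMs: 4200,
  cacheReadTokens: 900,
};
console.log(telemetryFileName(record), toJsonl(record).trim());
```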
Provider Plugin Architecture
Adding a new LLM provider:
- Create src/providers/yourprovider.ts implementing LLMProvider
- Add case 'yourprovider': to src/providers/index.ts
- That's it: config-driven, no other changes needed
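The steps above might look roughly like this. The `LLMProvider` interface here is a deliberately reduced, hypothetical version (the real one in src/providers/types.ts also covers streaming and usage), and `createProvider` is an assumed name for the factory function.

```typescript
// Hypothetical minimal interface for illustration.
interface LLMProvider {
  readonly name: string;
  complete(prompt: string): Promise<string>;
}

// Step 1: src/providers/yourprovider.ts
class YourProvider implements LLMProvider {
  readonly name = "yourprovider";
  async complete(prompt: string): Promise<string> {
    return `echo: ${prompt}`; // placeholder for a real API call
  }
}

// Step 2: the factory in src/providers/index.ts gains one case.
function createProvider(name: string): LLMProvider {
  switch (name) {
    case "yourprovider":
      return new YourProvider();
    default:
      throw new Error(`Unknown provider: ${name}`);
  }
}

console.log(createProvider("yourprovider").name);
```

With this shape, setting `provider: yourprovider` in config.yaml would be enough to select the new class.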
Shared Memory
Both OpenJaw Agent and OpenJaw MCP server share the same SQLite database at ~/.openjaw/memory.db:
- Hybrid search: FTS5 keyword + Jaccard + HRR semantic similarity
- Searchable via /memory command or memory_search tool
No duplication — write from either mode, search from either mode.
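Of the three signals in the hybrid search, Jaccard similarity is the simplest to show: the size of the token overlap divided by the size of the token union. The tokenizer below is an assumption, and how the real implementation weights Jaccard against FTS5 and HRR scores is not covered here.

```typescript
// Lowercase and split on non-alphanumeric characters (assumed tokenizer).
function tokenize(text: string): Set<string> {
  return new Set(text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean));
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|, in the range 0..1.
function jaccard(a: string, b: string): number {
  const setA = tokenize(a);
  const setB = tokenize(b);
  if (setA.size === 0 && setB.size === 0) return 0;
  let shared = 0;
  setA.forEach((token) => {
    if (setB.has(token)) shared++;
  });
  return shared / (setA.size + setB.size - shared);
}

// {urgent, email, from, bob} vs {email, from, alice}: 2 shared out of 5 distinct.
console.log(jaccard("urgent email from Bob", "email from Alice")); // 0.4
```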
No Conflicts with MCP Server
Both can run simultaneously:
| | MCP Server | Agent |
|-|-----------|-------|
| Transport | stdio (pipe) | terminal (readline/Ink) |
| Ports | none | none |
| Config | ~/.openjaw/config.yaml | ~/.openjaw-agent/config.yaml |
| Memory | ~/.openjaw/memory/ (shared) | ~/.openjaw/memory/ (shared) |
| Sessions | managed by host | ~/.openjaw-agent/sessions/ |
| Telemetry | — | ~/.openjaw-agent/telemetry/ |
Build from Source
cd projects/openjaw-agent
# Install dependencies (includes openjaw core as local dependency)
npm install
# Build TypeScript
npm run build
# Run
node dist/main.js
# Or use the batch file
openjawagent.bat
Development
# Run in dev mode (tsx, no build needed)
npm run dev
# Rebuild after changes
npm run build
CLI Reference
openjaw-agent Start new session (new TUI)
openjaw-agent --legacy-ui Start previous Ink UI (pre-rewrite) for A/B comparison
openjaw-agent --legacy Start legacy readline REPL
openjaw-agent --resume <session-id> Resume a specific session
openjaw-agent --continue (-c) Resume the most recent session
openjaw-agent --sessions List recent sessions
openjaw-agent --telegram New TUI + Telegram bridge (hybrid)
openjaw-agent --telegram --headless Telegram only (no terminal UI)
openjaw-agent --teams New TUI + Teams self-chat bridge
openjaw-agent --feishu New TUI + Feishu bot bridge
openjaw-agent --wechat New TUI + WeChat iLink bridge
openjaw-agent --help Show help
Environment Variables (Optional Overrides)
| Variable | Description |
|----------|-------------|
| ANTHROPIC_API_KEY | Anthropic API key (overrides config) |
| ANTHROPIC_AUTH_TOKEN | Anthropic auth token (for proxy) |
| ANTHROPIC_BASE_URL | Anthropic proxy URL |
| OPENAI_API_KEY | OpenAI API key (overrides config) |
| OPENAI_BASE_URL | OpenAI proxy URL |
Config file values are the default. Environment variables override when set.
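The precedence rule can be sketched as a small merge function. `applyEnvOverrides` and the `LlmSettings` shape are hypothetical, and the sketch treats an empty variable as unset, which is an assumption about the real loader in src/config.ts.

```typescript
interface LlmSettings {
  apiKey: string;
  baseUrl: string;
}
type Env = Record<string, string | undefined>;

// Start from config file values; a non-empty environment variable replaces them.
function applyEnvOverrides(config: LlmSettings, env: Env): LlmSettings {
  return {
    apiKey: env.ANTHROPIC_API_KEY || config.apiKey,
    baseUrl: env.ANTHROPIC_BASE_URL || config.baseUrl,
  };
}

const fromFile: LlmSettings = {
  apiKey: "proxy-token",
  baseUrl: "http://localhost:23333/api/anthropic",
};
const effective = applyEnvOverrides(fromFile, { ANTHROPIC_API_KEY: "key-from-env" });
console.log(effective);
```

In the agent itself the second argument would presumably be `process.env`; passing it explicitly keeps the function easy to test.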
License
Internal use only. Part of the BravoPM monorepo.
