@zhihand/openclaw
v0.9.15
OpenClaw host adapter for the ZhiHand control model
ZhiHand OpenClaw Adapter
This package provides the public OpenClaw-side adapter for ZhiHand.
It is a thin plugin layer on top of the shared ZhiHand control-plane contract.
What It Does
- registers one OpenClaw host instance with the deployment control plane
- creates QR-based pairing sessions for the ZhiHand mobile app
- stores pairing state under the OpenClaw state directory
- fetches the latest uploaded phone screen snapshot
- sends control commands and waits for command ACK status
Install And First Run
The shortest working setup on a fresh OpenClaw host is:
openclaw plugins install @zhihand/openclaw
openclaw config set plugins.allow '["openclaw"]' --strict-json
openclaw config set tools.allow '["openclaw"]' --strict-json
openclaw doctor --generate-gateway-token
export ZHIHAND_GATEWAY_TOKEN="$(python3 - <<'PY'
import json
from pathlib import Path
config = json.loads((Path.home() / '.openclaw' / 'openclaw.json').read_text())
print(config['gateway']['auth']['token'])
PY
)"
openclaw config set gateway.http.endpoints.responses.enabled true --strict-json
openclaw config set plugins.entries.openclaw.config.gatewayAuthToken "\"$ZHIHAND_GATEWAY_TOKEN\"" --strict-json
Then restart or reload OpenClaw if your deployment requires it.
Why these steps matter:
- plugins.allow trusts the plugin id so OpenClaw will load the extension without the fresh-install warning.
- tools.allow enables ZhiHand's optional plugin tools for the agent runtime. Without it, mobile chat can answer text but cannot call zhihand_status or zhihand_control.
- gateway.http.endpoints.responses.enabled turns on the local OpenClaw POST /v1/responses route. Without it, the plugin can load and pair, but mobile prompts fail with OpenClaw /v1/responses returned 404.
- gatewayAuthToken is required for the plugin's native relay into the local OpenClaw POST /v1/responses endpoint.
- without gatewayAuthToken, the plugin loads but logs ZhiHand prompt relay disabled... gatewayAuthToken and mobile prompts do not reach the local runtime.
If you already know your gateway token, you can set it directly:
openclaw config set plugins.entries.openclaw.config.gatewayAuthToken '"your-gateway-token"' --strict-json
If you prefer pinned installs for supply-chain stability on a first install, or after deleting the existing extension directory for a reinstall, install an exact published version:
openclaw plugins install @zhihand/openclaw@<version>
Development fallback from a local checkout:
openclaw plugins install --link /path/to/zhihand/packages/host-adapters/openclaw
Recommended discovery paths after npm publication:
- package README
- OpenClawDir or another community plugin directory
- external catalogs when the host deployment supports them
Do not assume a first-party plugin store UI is the only distribution path.
Expected Warnings
These warnings are normal during setup and tell you what is still missing:
- plugins.allow is empty
  Run openclaw config set plugins.allow '["openclaw"]' --strict-json.
- ZhiHand optional tools are not enabled for OpenClaw agent
  Run openclaw config set tools.allow '["openclaw"]' --strict-json, or add tools.allow: ["openclaw"] to the dedicated mobile agent in agents.list.
- OpenClaw /v1/responses returned 404
  Run openclaw config set gateway.http.endpoints.responses.enabled true --strict-json, then restart the gateway.
- ZhiHand prompt relay disabled ... gatewayAuthToken
  Set plugins.entries.openclaw.config.gatewayAuthToken to your current OpenClaw gateway token.
These are OpenClaw deployment warnings, not ZhiHand plugin install failures:
- gateway.trusted_proxies_missing
- origin not allowed
- Control UI browser pairing prompts
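The four setup warnings above map onto four config values, so a quick pre-flight check is possible before starting the gateway. The following is a minimal sketch, not part of the plugin: it assumes the dotted paths used with openclaw config set correspond to nested keys in the loaded openclaw.json, and the helper names get_path and missing_prerequisites are hypothetical. It only inspects the top-level tools.allow, not a per-agent tools.allow inside agents.list.

```python
from typing import Any, List


def get_path(config: dict, dotted: str) -> Any:
    """Walk a dotted path such as 'plugins.allow'; return None if any key is missing."""
    node: Any = config
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def missing_prerequisites(config: dict) -> List[str]:
    """Return the expected-warning text for each setup step that is still missing.

    Assumption: dotted CLI paths mirror the nested JSON layout of openclaw.json.
    """
    warnings: List[str] = []
    if "openclaw" not in (get_path(config, "plugins.allow") or []):
        warnings.append("plugins.allow is empty")
    if "openclaw" not in (get_path(config, "tools.allow") or []):
        warnings.append("ZhiHand optional tools are not enabled for OpenClaw agent")
    if get_path(config, "gateway.http.endpoints.responses.enabled") is not True:
        warnings.append("OpenClaw /v1/responses returned 404")
    if not get_path(config, "plugins.entries.openclaw.config.gatewayAuthToken"):
        warnings.append("ZhiHand prompt relay disabled ... gatewayAuthToken")
    return warnings
```

An empty result means none of the four plugin-side prerequisites is missing; the OpenClaw deployment warnings listed above are outside what this check covers.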
Minimal plugin config example:
{
  "plugins": {
    "allow": ["openclaw"],
    "entries": {
      "openclaw": {
        "enabled": true,
        "config": {
          "gatewayAuthToken": "set-this-to-your-openclaw-gateway-token"
        }
      }
    }
  }
}
What is not a plugin prerequisite:
- Control UI auth mode choices such as password vs token
- gateway.controlUi.allowedOrigins
- browser device pairing / Control UI login
Those belong to the OpenClaw gateway deployment itself. ZhiHand only needs the
current gateway token value for plugins.entries.openclaw.config.gatewayAuthToken;
it does not require you to set up the Control UI, browser pairing, or allowed
origins before the plugin can load and relay prompts.
OpenClaw Plugin Config
The plugin reads its config from:
plugins.entries.openclaw.config
Supported fields:
- controlPlaneEndpoint
- originListener
- displayName
- stableIdentity
- pairingTTLSeconds
- appDownloadURL
- gatewayResponsesEndpoint
- gatewayAuthToken
- mobileAgentId
- updateCheckEnabled
- updateCheckIntervalHours
- requestedScopes
Normal hosted deployments can leave most fields empty.
Recommended minimum:
- only gatewayAuthToken
Example:
{
  "plugins": {
    "allow": ["openclaw"],
    "entries": {
      "openclaw": {
        "enabled": true,
        "config": {
          "gatewayAuthToken": "set-this-in-deployment"
        }
      }
    }
  }
}
CLI equivalent for the allowlist and plugin token steps:
openclaw config set plugins.allow '["openclaw"]' --strict-json
openclaw config set tools.allow '["openclaw"]' --strict-json
openclaw config set plugins.entries.openclaw.config.gatewayAuthToken '"your-gateway-token"' --strict-json
Advanced self-host example:
{
  "plugins": {
    "allow": ["openclaw"],
    "entries": {
      "openclaw": {
        "enabled": true,
        "config": {
          "controlPlaneEndpoint": "https://api.example.com",
          "originListener": "https://host.example.zhihand.com",
          "displayName": "ZhiHand @ example-host",
          "stableIdentity": "openclaw-zhihand:example-host",
          "pairingTTLSeconds": 600,
          "appDownloadURL": "https://zhihand.com/download",
          "gatewayResponsesEndpoint": "http://127.0.0.1:18789/v1/responses",
          "gatewayAuthToken": "set-this-in-deployment",
          "mobileAgentId": "zhihand-mobile",
          "requestedScopes": [
            "observe",
            "session.control",
            "screen.read",
            "screen.capture",
            "ble.control"
          ]
        }
      }
    }
  }
}
Defaults:
- controlPlaneEndpoint: https://api.zhihand.com
- pairingTTLSeconds: 600
- appDownloadURL: https://zhihand.com/download
- gatewayResponsesEndpoint: http://127.0.0.1:18789/v1/responses
- mobileAgentId: zhihand-mobile
- updateCheckEnabled: true
- updateCheckIntervalHours: 24
- requestedScopes: recommended ZhiHand defaults
- stableIdentity: auto-generated from hostname
- originListener: optional; the control plane can fill a default host metadata value
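To see what a mostly empty plugin config resolves to, the documented defaults can be overlaid with whatever the deployment sets explicitly. This is an illustrative sketch only: the function name effective_config is hypothetical, and treating None or an empty string as "unset" is an assumption about how empty fields behave. Defaults that the list above describes only in words (requestedScopes, stableIdentity, originListener) are left out.

```python
# Defaults copied from the list above; fields without a concrete documented
# value (requestedScopes, stableIdentity, originListener) are omitted.
DEFAULTS = {
    "controlPlaneEndpoint": "https://api.zhihand.com",
    "pairingTTLSeconds": 600,
    "appDownloadURL": "https://zhihand.com/download",
    "gatewayResponsesEndpoint": "http://127.0.0.1:18789/v1/responses",
    "mobileAgentId": "zhihand-mobile",
    "updateCheckEnabled": True,
    "updateCheckIntervalHours": 24,
}


def effective_config(user_config: dict) -> dict:
    """Overlay explicitly set fields on the documented defaults.

    Assumption: None and "" count as unset, so the default still applies.
    """
    merged = dict(DEFAULTS)
    for key, value in user_config.items():
        if value is None or value == "":
            continue
        merged[key] = value
    return merged
```

With only gatewayAuthToken set, every other field in the result comes from the defaults, which matches the recommended minimum config.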
Do not store secrets in this package or this public repository.
Best Practice
Use a dedicated OpenClaw agent/runtime path for ZhiHand mobile prompts.
- normal chat and phone-operation requests should use the same OpenClaw agent
- the plugin should stay thin and only provide pairing, tools, and relay glue
- zhihand_* tools should be registered as optional and enabled only for the dedicated mobile agent
- do not reintroduce a plugin-owned planner loop or direct codex exec orchestration inside this public plugin
Recommended deployment shape:
{
  "agents": {
    "list": [
      {
        "id": "zhihand-mobile",
        "model": "openai-codex/gpt-5.4",
        "tools": {
          "allow": ["openclaw"]
        }
      }
    ]
  }
}
Why this is the preferred path:
- official OpenClaw plugin docs expect tools to be exposed to the agent runtime
- official OpenClaw CLI backend docs treat codex-cli/* as text-only fallback paths where tools are disabled
- keeping the planner inside the native runtime preserves gateway policy, auditability, and tool scoping
Deployment requirements for the native runtime path:
- the OpenClaw gateway must expose local POST /v1/responses
- the deployment must provide a gateway bearer token to the plugin
- the dedicated ZhiHand mobile agent must use a tool-capable provider model such as openai-codex/gpt-5.4, not codex-cli/*
- if these native-runtime prerequisites are missing, the prompt relay stays disabled and logs the configuration error during startup
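The first two requirements amount to an authenticated POST against the local gateway. The sketch below only builds such a request so its shape can be inspected; it does not send anything. The function name build_relay_request is hypothetical, and the JSON body with a single input field is an assumption modelled on the OpenAI Responses request style, not the plugin's documented payload. How the request is routed to the dedicated mobile agent is deployment-specific and not shown.

```python
import json
import urllib.request


def build_relay_request(endpoint: str, gateway_token: str, prompt_text: str) -> urllib.request.Request:
    """Build (but do not send) a bearer-authenticated POST to the responses endpoint.

    Assumption: the body is a JSON object with an "input" field; the real
    plugin payload may carry more fields.
    """
    body = json.dumps({"input": prompt_text}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {gateway_token}",
            "Content-Type": "application/json",
        },
    )
```

For a manual check, a request built with the default endpoint http://127.0.0.1:18789/v1/responses and the current gateway token can be passed to urllib.request.urlopen; a 404 there corresponds to the disabled responses route described in the setup steps.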
OpenAI Computer Tool Status
openai-codex/gpt-5.4 is still the recommended model for the ZhiHand mobile
agent, but the current OpenClaw relay path does not expose OpenAI's native
tools: [{ "type": "computer" }] workflow.
Current behavior:
- ZhiHand sends mobile prompts into local OpenClaw POST /v1/responses
- OpenClaw's hosted-tool surface currently accepts function tools only
- the mobile agent therefore uses zhihand_screen_read and zhihand_control, not OpenAI computer_call / computer_call_output
Implication:
- you can use openai-codex/gpt-5.4 for better reasoning and screenshot understanding
- you cannot assume OpenClaw will automatically switch to OpenAI's native computer tool loop
Using the GA OpenAI computer tool would require either:
- upstream OpenClaw support for computer / computer_call_output, or
- a separate direct-to-OpenAI harness that bypasses local OpenClaw /v1/responses
That direct harness is intentionally not the public ZhiHand/OpenClaw contract today.
Release Shape
Recommended first public release:
- mobile app
- hosted pair.zhihand.com and api.zhihand.com
- npm-published OpenClaw plugin
For non-OpenClaw hosts, publish additional thin adapters on top of the same control-plane contract instead of growing this package into a multi-host shell.
Slash Commands
- /zhihand pair
- /zhihand status
- /zhihand unpair
- /zhihand update
- /zhihand update check
/zhihand pair returns a browser-first pairing summary:
- app download URL
- QR URL
Open the QR URL in a browser to display the actual scannable QR page.
Plugin update behavior:
- on startup, the plugin checks npm for a newer published version by default
- /zhihand update check forces a fresh version lookup and prints the result
- /zhihand update prints the recommended host-side update command
- the preferred host-side update command is openclaw plugins update openclaw
Recommended host-side update command:
openclaw plugins update openclaw
For an installed plugin, upgrade with openclaw plugins update openclaw. Reserve openclaw plugins install @zhihand/openclaw@<version> for a first install or a reinstall after removal.
The current hosted control path is:
- HTTP requests for pairing, uploads, acknowledgements, and control writes
- SSE downlink for prompt, reply, and command events
- per-device profile snapshots so the host can adapt behavior by runtime family
Tools
- zhihand_pair
- zhihand_status
- zhihand_screen_read
- zhihand_control
zhihand_control supports:
- click
- long_click
- move
- move_to
- swipe
- back
- home
- enter
- input_text
- open_app
- set_clipboard
- start_live_capture
- stop_live_capture
Coordinate rules:
- click, long_click, and move_to use xRatio and yRatio in [0,1] from the latest screenshot.
- swipe uses x1Ratio, y1Ratio, x2Ratio, and y2Ratio in [0,1].
- move uses dxRatio and dyRatio in [-1,1] for relative pointer deltas.
- Do not send raw screenshot pixel coordinates through the public tool API.
- zhihand_screen_read should be treated as fresh-only visual state. If the latest uploaded snapshot is stale, the tool fails instead of letting the agent click from an old frame.
- When a keyboard is visible and the goal is to submit search, send, or confirm text, prefer enter over clicking the IME action button.
- input_text supports mode:
  - auto: current default, resolved on the mobile runtime as paste
  - paste: clipboard-first plus HID paste shortcut
  - type: raw HID keyboard typing, reserved for sensitive fields or when paste fails
- input_text also supports submit=true to send Enter immediately after the text input completes.
- auto and paste overwrite the mobile runtime clipboard as part of the reliability trade-off. Use type for sensitive fields or when clipboard mutation is not acceptable.
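Because the tool API takes ratios rather than pixels, a caller that locates a target in screenshot pixels has to normalize by the screenshot size first. The helpers below are a minimal sketch of that conversion. The function names are hypothetical, the returned dicts contain only the ratio fields named in the coordinate rules (not a full zhihand_control payload), and clamping out-of-range values instead of rejecting them is a design choice of this example.

```python
def _clamp(value: float, low: float, high: float) -> float:
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


def _ratio(pixels: float, size_px: float, low: float = 0.0) -> float:
    """Convert a pixel measurement to a ratio of the screenshot dimension."""
    if size_px <= 0:
        raise ValueError("screenshot dimension must be positive")
    return _clamp(pixels / size_px, low, 1.0)


def click_args(x_px: float, y_px: float, width_px: float, height_px: float) -> dict:
    """Ratio fields for click, long_click, and move_to: xRatio and yRatio in [0,1]."""
    return {"xRatio": _ratio(x_px, width_px), "yRatio": _ratio(y_px, height_px)}


def swipe_args(x1_px: float, y1_px: float, x2_px: float, y2_px: float,
               width_px: float, height_px: float) -> dict:
    """Ratio fields for swipe: start and end points, each in [0,1]."""
    return {
        "x1Ratio": _ratio(x1_px, width_px),
        "y1Ratio": _ratio(y1_px, height_px),
        "x2Ratio": _ratio(x2_px, width_px),
        "y2Ratio": _ratio(y2_px, height_px),
    }


def move_args(dx_px: float, dy_px: float, width_px: float, height_px: float) -> dict:
    """Ratio fields for move: relative pointer deltas in [-1,1]."""
    return {
        "dxRatio": _ratio(dx_px, width_px, low=-1.0),
        "dyRatio": _ratio(dy_px, height_px, low=-1.0),
    }
```

For a 1080 by 2400 screenshot, click_args(540, 1200, 1080, 2400) yields the screen center as xRatio 0.5 and yRatio 0.5. The width and height should come from the same snapshot the target was located in.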
State Files
Relative to the OpenClaw state directory:
- plugins/openclaw/state.json: stored pairing state for the host instance
- plugins/openclaw/latest-screen.jpg: last fetched screen snapshot cache
The adapter may automatically advance local pairing state to the latest claimed session for the same host edge when the stored pairing becomes stale. This is a host-side recovery path and does not change the public QR claim flow.
Pairing Flow
- The host registers itself against the control plane.
- The plugin creates a pairing session and pair URL.
- The pair URL is the canonical QR landing page; browsers render a scannable HTML page, while the mobile app resolves the same URL in JSON mode.
- The mobile app scans the QR code and claims the pairing session.
- The control plane returns a long-lived mobile credential.
- OpenClaw can then use zhihand_status, zhihand_screen_read, and zhihand_control.
- If the phone later claims a newer pairing session for the same host edge, the adapter can recover forward to that latest claimed session instead of staying pinned to an older local credential.
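The recover-forward rule in the last step can be pictured as "pick the most recently claimed session for the same host edge". The sketch below illustrates only that selection logic. Every name in it (PairingSession, its fields, recover_forward) is hypothetical, and it simplifies the staleness condition to a timestamp comparison; the adapter's actual state format and staleness check are not described here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PairingSession:
    """Hypothetical view of a pairing session; not the adapter's stored schema."""
    session_id: str
    host_edge: str
    claimed_at: Optional[float]  # epoch seconds; None means never claimed


def recover_forward(stored: PairingSession, known: List[PairingSession]) -> PairingSession:
    """Return the latest claimed session for the same host edge, else keep the stored one."""
    candidates = [
        s for s in known
        if s.host_edge == stored.host_edge and s.claimed_at is not None
    ]
    if not candidates:
        return stored
    latest = max(candidates, key=lambda s: s.claimed_at)
    if stored.claimed_at is None or latest.claimed_at > stored.claimed_at:
        return latest
    return stored
```

Sessions that belong to a different host edge never qualify, which mirrors the statement that recovery stays within the same host edge and does not change the public QR claim flow.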
Mobile Prompt Path
The supported runtime path is:
- The mobile app uploads a prompt to the control plane.
- The mobile app may also upload prompt attachments before the prompt itself.
- The OpenClaw plugin polls pending prompts.
- The plugin downloads any prompt attachments from the control plane.
- The plugin prepares multimodal native-agent input:
  - images become input_image
  - supported documents become input_file
  - audio attachments are transcribed into text context
  - video attachments stay limited context and may use preview images
- The plugin forwards the prepared prompt to the local OpenClaw POST /v1/responses endpoint for the dedicated mobile agent.
- The dedicated mobile agent decides whether to answer directly or call zhihand_status, zhihand_screen_read, and zhihand_control.
- The plugin writes the final assistant reply back to the control plane.
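The attachment preparation step is essentially a dispatch on attachment type. The sketch below shows one way to express that mapping. It is an illustration under stated assumptions: the function name classify_attachment, the dispatch on MIME type, the SUPPORTED_DOCUMENT_TYPES set, and the handling labels other than input_image and input_file are all hypothetical, since the list of supported document formats is not given here.

```python
# Assumption: which document formats count as "supported" is not documented;
# this set is a placeholder for illustration.
SUPPORTED_DOCUMENT_TYPES = {"application/pdf", "text/plain"}


def classify_attachment(mime_type: str) -> str:
    """Map an attachment MIME type to the handling described in the prompt path."""
    mime = mime_type.lower().split(";")[0].strip()
    if mime.startswith("image/"):
        return "input_image"
    if mime in SUPPORTED_DOCUMENT_TYPES:
        return "input_file"
    if mime.startswith("audio/"):
        return "transcribe_to_text"  # audio is transcribed into text context
    if mime.startswith("video/"):
        return "limited_context"  # video stays limited context, may use preview images
    return "unsupported"
```

Keeping this mapping on the host side is consistent with the attachment guidance that the mobile app uploads raw attachments and leaves transcription to the host.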
Task cancellation also uses this same path:
- If the mobile app marks the active prompt as cancelled, the plugin aborts the in-flight native mobile-agent run.
- The final reply for that prompt becomes a system message indicating that the user stopped the task.
Capture Constraint
zhihand_screen_read returns the latest uploaded snapshot, not a live video
stream.
start_live_capture may return a permission-required result until the mobile app
already has an active screen-capture session.
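Since the tool works from uploaded snapshots rather than a live stream, a caller that caches snapshots may want its own age check before acting on one. The following is a minimal sketch of such a guard. The names StaleSnapshotError and require_fresh and the 10-second threshold are hypothetical; the actual freshness window enforced by zhihand_screen_read is not stated here.

```python
import time
from typing import Optional

# Hypothetical threshold for illustration; the real freshness window is not documented.
MAX_SNAPSHOT_AGE_SECONDS = 10.0


class StaleSnapshotError(RuntimeError):
    """Raised when a snapshot is too old to act on safely."""


def require_fresh(uploaded_at: float, now: Optional[float] = None,
                  max_age: float = MAX_SNAPSHOT_AGE_SECONDS) -> float:
    """Return the snapshot age in seconds, or raise if it exceeds max_age."""
    current = time.time() if now is None else now
    age = current - uploaded_at
    if age > max_age:
        raise StaleSnapshotError(f"snapshot is {age:.1f}s old (limit {max_age:.1f}s)")
    return age
```

Failing loudly on an old frame follows the same reasoning as the tool itself: it is safer to request a new snapshot than to click based on outdated screen state.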
Attachment Best Practice
Preferred handling:
- images and documents remain raw attachments
- voice notes remain raw audio attachments and are transcribed on the host
- the mobile app should not treat app-local speech-to-text as the canonical contract
- video support is intentionally conservative and should be treated as limited context until the deployment adds explicit video understanding
