@zhihand/zhihand
v0.4.1
OpenClaw host adapter for the ZhiHand control model
ZhiHand OpenClaw Adapter
This package provides the public OpenClaw-side adapter for ZhiHand.
It is a thin plugin layer on top of the shared ZhiHand control-plane contract.
What It Does
- registers one OpenClaw host instance with the deployment control plane
- creates QR-based pairing sessions for the Android app
- stores pairing state under the OpenClaw state directory
- fetches the latest uploaded phone screen snapshot
- sends control commands and waits for command ACK status
Recommended Install
Primary release path:
openclaw plugins install @zhihand/zhihand
Development fallback from a local checkout:
openclaw plugins install --link /path/to/zhihand/packages/host-adapters/openclaw
Recommended discovery paths after npm publication:
- package README
- OpenClawDir or another community plugin directory
- external catalogs when the host deployment supports them
Do not assume a first-party plugin store UI is the only distribution path.
OpenClaw Plugin Config
The plugin reads its config from:
plugins.entries.zhihand.config
Supported fields:
- controlPlaneEndpoint
- originListener
- displayName
- stableIdentity
- pairingTTLSeconds
- appDownloadURL
- gatewayResponsesEndpoint
- gatewayAuthToken
- mobileAgentId
- requestedScopes
Normal hosted deployments can leave most fields empty.
Recommended minimum:
- no plugin config at all if OPENCLAW_GATEWAY_TOKEN is already available in the host environment
- otherwise only gatewayAuthToken
Example:
{
"plugins": {
"allow": ["zhihand"],
"entries": {
"zhihand": {
"enabled": true,
"config": {
"gatewayAuthToken": "set-this-in-deployment"
}
}
}
}
}
Advanced self-host example:
{
"plugins": {
"allow": ["zhihand"],
"entries": {
"zhihand": {
"enabled": true,
"config": {
"controlPlaneEndpoint": "https://api.example.com",
"originListener": "https://host.example.zhihand.com",
"displayName": "ZhiHand @ example-host",
"stableIdentity": "openclaw-zhihand:example-host",
"pairingTTLSeconds": 600,
"appDownloadURL": "https://zhihand.com/download",
"gatewayResponsesEndpoint": "http://127.0.0.1:18789/v1/responses",
"gatewayAuthToken": "set-this-in-deployment",
"mobileAgentId": "zhihand-mobile",
"requestedScopes": [
"observe",
"session.control",
"screen.read",
"screen.capture",
"ble.control"
]
}
}
}
}
}
Defaults:
- controlPlaneEndpoint: https://api.zhihand.com
- pairingTTLSeconds: 600
- appDownloadURL: https://zhihand.com/download
- gatewayResponsesEndpoint: http://127.0.0.1:18789/v1/responses
- mobileAgentId: zhihand-mobile
- requestedScopes: recommended ZhiHand defaults
- stableIdentity: auto-generated from hostname
- originListener: optional; the control plane can fill a default host metadata value
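As a rough illustration of how the documented defaults combine with user-supplied config, here is a minimal TypeScript sketch. The helper name resolveConfig is hypothetical, and the stableIdentity format is an assumption copied from the self-host example; the plugin's real hostname derivation may differ. requestedScopes is left unset because the recommended default scopes are not listed here.

```typescript
// Field names come from the "Supported fields" list; all are optional.
interface ZhiHandConfig {
  controlPlaneEndpoint?: string;
  originListener?: string;
  displayName?: string;
  stableIdentity?: string;
  pairingTTLSeconds?: number;
  appDownloadURL?: string;
  gatewayResponsesEndpoint?: string;
  gatewayAuthToken?: string;
  mobileAgentId?: string;
  requestedScopes?: string[];
}

// Values copied from the documented defaults.
const DOCUMENTED_DEFAULTS = {
  controlPlaneEndpoint: "https://api.zhihand.com",
  pairingTTLSeconds: 600,
  appDownloadURL: "https://zhihand.com/download",
  gatewayResponsesEndpoint: "http://127.0.0.1:18789/v1/responses",
  mobileAgentId: "zhihand-mobile",
};

// Hypothetical helper: overlays user config on top of the documented defaults.
// ASSUMPTION: "openclaw-zhihand:<hostname>" mirrors the self-host example only.
function resolveConfig(user: ZhiHandConfig, hostname: string): ZhiHandConfig {
  const resolved: ZhiHandConfig = { ...DOCUMENTED_DEFAULTS, ...user };
  if (!resolved.stableIdentity) {
    resolved.stableIdentity = "openclaw-zhihand:" + hostname;
  }
  return resolved;
}
```

With this shape, a hosted deployment that sets only gatewayAuthToken still ends up with a complete endpoint configuration.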
Do not store secrets in this package or this public repository.
Best Practice
Use a dedicated OpenClaw agent/runtime path for ZhiHand mobile prompts.
- normal chat and phone-operation requests should use the same OpenClaw agent
- the plugin should stay thin and only provide pairing, tools, and relay glue
- zhihand_* tools should be registered as optional and enabled only for the dedicated mobile agent
- do not reintroduce a plugin-owned planner loop or direct codex exec orchestration inside this public plugin
Recommended deployment shape:
{
"agents": {
"list": [
{
"id": "zhihand-mobile",
"model": "openai-codex/gpt-5.4",
"tools": {
"allow": ["zhihand"]
}
}
]
}
}
Why this is the preferred path:
- official OpenClaw plugin docs expect tools to be exposed to the agent runtime
- official OpenClaw CLI backend docs treat codex-cli/* as text-only fallback paths where tools are disabled
- keeping the planner inside the native runtime preserves gateway policy, auditability, and tool scoping
Deployment requirements for the native runtime path:
- the OpenClaw gateway must expose local POST /v1/responses
- the deployment must provide a gateway bearer token to the plugin
- the dedicated ZhiHand mobile agent must use a tool-capable provider model such as openai-codex/gpt-5.4, not codex-cli/*
- if these native-runtime prerequisites are missing, the prompt relay stays disabled and logs the configuration error during startup
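The prerequisites above could be validated at startup along these lines. This is a sketch only: the function checkNativeRuntime, the NativeRuntimeSettings shape, and the message wording are hypothetical and are not taken from the plugin source.

```typescript
// Hypothetical view of the settings the startup check needs.
interface NativeRuntimeSettings {
  gatewayResponsesEndpoint?: string; // local POST /v1/responses URL
  gatewayAuthToken?: string;         // gateway bearer token
  agentModel?: string;               // model of the dedicated mobile agent
}

// Returns a list of configuration problems.
// An empty list means the prompt relay may be enabled.
function checkNativeRuntime(settings: NativeRuntimeSettings): string[] {
  const problems: string[] = [];
  if (!settings.gatewayResponsesEndpoint) {
    problems.push("gateway POST /v1/responses endpoint is not configured");
  }
  if (!settings.gatewayAuthToken) {
    problems.push("gateway bearer token is missing");
  }
  const model = settings.agentModel || "";
  if (model === "") {
    problems.push("dedicated mobile agent has no model configured");
  } else if (model.indexOf("codex-cli/") === 0) {
    // codex-cli/* backends are text-only, so tools would be disabled.
    problems.push("model " + model + " is a text-only CLI backend");
  }
  return problems;
}
```

A caller would log every returned problem and leave the relay disabled whenever the list is non-empty.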
Release Shape
Recommended first public release:
- Android app
- hosted pair.zhihand.com and api.zhihand.com
- npm-published OpenClaw plugin
For non-OpenClaw hosts, publish additional thin adapters on top of the same control-plane contract instead of growing this package into a multi-host shell.
Slash Commands
- /zhihand pair
- /zhihand status
- /zhihand unpair
/zhihand pair returns a browser-first pairing summary:
- app download URL
- QR URL
Open the QR URL in a browser to display the actual scannable QR page.
Tools
- zhihand_pair
- zhihand_status
- zhihand_screen_read
- zhihand_control
zhihand_control supports:
- click
- long_click
- move
- move_to
- swipe
- back
- home
- enter
- input_text
- open_app
- set_clipboard
- start_live_capture
- stop_live_capture
Coordinate rules:
- click, long_click, and move_to use xRatio and yRatio in [0,1] from the latest screenshot.
- swipe uses x1Ratio, y1Ratio, x2Ratio, and y2Ratio in [0,1].
- move uses dxRatio and dyRatio in [-1,1] for relative pointer deltas.
- Do not send raw screenshot pixel coordinates through the public tool API.
- zhihand_screen_read should be treated as fresh-only visual state. If the latest uploaded snapshot is stale, the tool fails instead of letting the agent click from an old frame.
- When a keyboard is visible and the goal is to submit search, send, or confirm text, prefer enter over clicking the IME action button.
- input_text supports mode:
  - auto: current default, resolved on Android as paste
  - paste: clipboard-first plus HID paste shortcut
  - type: raw HID keyboard typing, reserved for sensitive fields or when paste fails
- input_text also supports submit=true to send Enter immediately after the text input completes.
- auto and paste overwrite the Android system clipboard as part of the reliability trade-off. Use type for sensitive fields or when clipboard mutation is not acceptable.
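Since the tool API accepts ratios rather than pixels, a caller that reasons in screenshot pixels has to convert first. The sketch below shows one way to do that. The ratio field names come from the rules above; the action key, the flat argument layout, and the helper names are assumptions made for illustration.

```typescript
interface FrameSize { width: number; height: number; }
interface PixelPoint { x: number; y: number; }

function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

// Converts a pixel offset on the latest screenshot into a [0,1] ratio.
function toRatio(px: number, extent: number): number {
  if (!(extent > 0)) throw new Error("screenshot extent must be positive");
  return clamp(px / extent, 0, 1);
}

// ASSUMPTION: the "action" key is illustrative, not the documented schema.
function clickArgs(point: PixelPoint, frame: FrameSize) {
  return {
    action: "click",
    xRatio: toRatio(point.x, frame.width),
    yRatio: toRatio(point.y, frame.height),
  };
}

function swipeArgs(from: PixelPoint, to: PixelPoint, frame: FrameSize) {
  return {
    action: "swipe",
    x1Ratio: toRatio(from.x, frame.width),
    y1Ratio: toRatio(from.y, frame.height),
    x2Ratio: toRatio(to.x, frame.width),
    y2Ratio: toRatio(to.y, frame.height),
  };
}

// Relative pointer deltas are clamped to [-1,1] instead of [0,1].
function moveArgs(dxPx: number, dyPx: number, frame: FrameSize) {
  if (!(frame.width > 0 && frame.height > 0)) {
    throw new Error("screenshot extent must be positive");
  }
  return {
    action: "move",
    dxRatio: clamp(dxPx / frame.width, -1, 1),
    dyRatio: clamp(dyPx / frame.height, -1, 1),
  };
}
```

For a 1080x2400 screenshot, a tap at pixel (540, 1200) maps to xRatio 0.5 and yRatio 0.5.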
State Files
Relative to the OpenClaw state directory:
- plugins/zhihand/state.json: stored pairing state for the host instance
- plugins/zhihand/latest-screen.jpg: last fetched screen snapshot cache
The adapter may automatically advance local pairing state to the latest claimed session for the same host edge when the stored pairing becomes stale. This is a host-side recovery path and does not change the public QR claim flow.
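The recovery path described above amounts to picking the most recently claimed session for the same host edge. The following is a hedged sketch: the PairingSession fields and the recoverForward helper are hypothetical and do not reflect the actual state.json schema. How the adapter decides that the stored pairing is stale is not specified here, so the sketch assumes the caller has already made that decision.

```typescript
// Hypothetical record shape; the real state.json layout is not documented here.
interface PairingSession {
  sessionId: string;
  hostEdgeId: string;
  claimedAtMs: number | null; // null until the phone claims the session
}

// Returns the latest claimed session for the stored host edge,
// or the stored session itself when nothing newer has been claimed.
function recoverForward(stored: PairingSession, known: PairingSession[]): PairingSession {
  let best = stored;
  for (const candidate of known) {
    if (candidate.hostEdgeId !== stored.hostEdgeId) continue; // other host edge
    if (candidate.claimedAtMs === null) continue;             // never claimed
    if (best.claimedAtMs === null || candidate.claimedAtMs > best.claimedAtMs) {
      best = candidate;
    }
  }
  return best;
}
```

Sessions that belong to another host edge or that were never claimed are ignored, which keeps the public QR claim flow unchanged.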
Pairing Flow
- The host registers itself against the control plane.
- The plugin creates a pairing session and pair URL.
- The pair URL is the canonical QR landing page; browsers render a scannable HTML page, while the Android app resolves the same URL in JSON mode.
- The Android app scans the QR code and claims the pairing session.
- The control plane returns a long-lived mobile credential.
- OpenClaw can then use zhihand_status, zhihand_screen_read, and zhihand_control.
- If the phone later claims a newer pairing session for the same host edge, the adapter can recover forward to that latest claimed session instead of staying pinned to an older local credential.
Mobile Prompt Path
The supported runtime path is:
- Android app uploads a mobile prompt to the control plane.
- Android may also upload prompt attachments before the prompt itself.
- The OpenClaw plugin polls pending prompts.
- The plugin downloads any prompt attachments from the control plane.
- The plugin prepares multimodal native-agent input:
  - images become input_image
  - supported documents become input_file
  - audio attachments are transcribed into text context
  - video attachments stay limited context and may use preview images
- The plugin forwards the prepared prompt to the local OpenClaw POST /v1/responses endpoint for the dedicated mobile agent.
- The dedicated mobile agent decides whether to answer directly or call zhihand_status, zhihand_screen_read, and zhihand_control.
- The plugin writes the final assistant reply back to the control plane.
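The attachment-preparation step in the list above can be pictured as a mapping from attachment kinds to input parts. In this sketch only the input_image and input_file type names come from this README. The input_text part, the image_url and file_url fields, the Attachment shape, and the prepareInput helper are assumptions for illustration, and transcription is assumed to have happened already on the host.

```typescript
type AttachmentKind = "image" | "document" | "audio" | "video";

// Hypothetical attachment record after download from the control plane.
interface Attachment {
  kind: AttachmentKind;
  url: string;
  transcript?: string;      // audio only, produced by host-side transcription
  previewImageUrl?: string; // video only
}

// ASSUMPTION: part shapes are illustrative, not the gateway's exact schema.
type InputPart =
  | { type: "input_text"; text: string }
  | { type: "input_image"; image_url: string }
  | { type: "input_file"; file_url: string };

function prepareInput(promptText: string, attachments: Attachment[]): InputPart[] {
  const parts: InputPart[] = [{ type: "input_text", text: promptText }];
  for (const attachment of attachments) {
    switch (attachment.kind) {
      case "image":
        parts.push({ type: "input_image", image_url: attachment.url });
        break;
      case "document":
        parts.push({ type: "input_file", file_url: attachment.url });
        break;
      case "audio":
        // Audio reaches the agent as text context, not as raw audio.
        parts.push({
          type: "input_text",
          text: "[voice note transcript] " + (attachment.transcript || "(unavailable)"),
        });
        break;
      case "video":
        // Video stays limited context; a preview image is used when present.
        if (attachment.previewImageUrl) {
          parts.push({ type: "input_image", image_url: attachment.previewImageUrl });
        }
        parts.push({ type: "input_text", text: "[video attachment: limited context]" });
        break;
    }
  }
  return parts;
}
```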
Task cancellation also uses this same path:
- If Android marks the active prompt as cancelled, the plugin aborts the in-flight native mobile-agent run.
- The final reply for that prompt becomes a system message indicating that the user stopped the task.
Capture Constraint
zhihand_screen_read returns the latest uploaded snapshot, not a live video
stream.
start_live_capture may return a permission-required result until the Android
app already has an active screen-capture session.
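Because zhihand_screen_read returns a stored snapshot rather than a live stream, a freshness guard of roughly the following shape is what keeps an agent from acting on an old frame. This is a sketch under stated assumptions: the Snapshot shape, the requireFresh helper, and the 5-second threshold are all hypothetical, and the plugin's real staleness rule is not documented here.

```typescript
// Hypothetical snapshot metadata; only the capture time matters for this check.
interface Snapshot {
  capturedAtMs: number;
}

// ASSUMPTION: 5 seconds is an illustrative threshold, not the plugin's value.
const MAX_SNAPSHOT_AGE_MS = 5000;

// Returns the snapshot when it is recent enough, otherwise throws so that
// the caller cannot plan a click against an outdated frame.
function requireFresh(
  snapshot: Snapshot,
  nowMs: number,
  maxAgeMs: number = MAX_SNAPSHOT_AGE_MS
): Snapshot {
  const ageMs = nowMs - snapshot.capturedAtMs;
  if (ageMs > maxAgeMs) {
    throw new Error("stale snapshot: " + ageMs + " ms old");
  }
  return snapshot;
}
```

Failing loudly here matches the fresh-only rule for zhihand_screen_read: an error prompts a new capture, while a silently reused frame could produce a misplaced tap.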
Attachment Best Practice
Preferred handling:
- images and documents remain raw attachments
- voice notes remain raw audio attachments and are transcribed on the host
- Android should not treat app-local speech-to-text as the canonical contract
- video support is intentionally conservative and should be treated as limited context until the deployment adds explicit video understanding
