@zerople/hermes-shim-http
v0.1.33
Published
Run Claude, Codex, or OpenCode as a local OpenAI-compatible HTTP provider for Hermes and other clients.
Maintainers
Readme
@zerople/hermes-shim-http
Run a local coding CLI like Claude, Codex, or OpenCode as an OpenAI-compatible HTTP server.
This package gives you a local provider endpoint that looks like OpenAI-compatible chat/responses APIs, while the actual reasoning is done by your installed CLI. It was designed for Hermes Agent, but it also works with other local clients that speak:
POST /v1/chat/completionsPOST /v1/responsesGET /v1/models
Highlights
- run locally with
npx @zerople/hermes-shim-http ... - supports
claude,codex, andopencode - exposes
chat/completions,responses,models, and debug observability endpoints - bootstraps its Python runtime automatically on first run
- keeps Hermes as the real tool executor when used as a Hermes provider
- reuses Claude sessions across turns and sends only delta prompts on resumed requests
- supports optional persistent session caching when
--cache-pathis set - reports fallback token/context estimates and supports context compaction controls
- emits structured request/session logs and SSE keepalive pings for long streams
What this does
@zerople/hermes-shim-http starts a small local HTTP server and forwards prompts to a local CLI such as:
claudecodexopencode
The wrapped CLI performs the reasoning. This shim translates the result into an OpenAI-compatible HTTP response, including streamed output and tool-call payloads.
In short:
- your CLI thinks and plans
- this shim exposes an HTTP API
- your client app talks to the shim like it talks to OpenAI-compatible providers
Who this is for
Use this package if you want to:
- connect a local coding CLI to Hermes Agent
- expose Claude/Codex/OpenCode through a local OpenAI-compatible endpoint
- test or prototype agent workflows without building a provider integration from scratch
- keep everything running locally on your own machine
Requirements
Before you start, make sure you have:
- Node.js 18+
- Python 3.10+ available in
PATH - Python virtual environment support available (
python3 -m venvmust work)- on Debian/Ubuntu/WSL, this often means installing
python3-venv
- on Debian/Ubuntu/WSL, this often means installing
- at least one supported CLI installed and working:
claudecodexopencode
On first run, the launcher automatically creates a cached Python virtual environment under
~/.cache/hermes-shim-http/and installs the pinned Python dependencies for you. If that bootstrap fails, runnpx @zerople/hermes-shim-http --doctorfor a preflight summary.
Quick start
1) Run it with npx
Important:
--cwdmust be a real, existing directory. Do not copy/path/to/projectliterally — it is only a placeholder. If the path does not exist, the wrapped CLI subprocess will fail withFileNotFoundError: [Errno 2] No such file or directory.
The safest way is to cd into your project first and use "$(pwd)":
cd /ABSOLUTE/PATH/TO/YOUR/PROJECT # replace with your real project path
npx @zerople/hermes-shim-http \
--host 127.0.0.1 \
--port 8765 \
--command claude \
--cwd "$(pwd)" \
--model opusOr pass an explicit absolute path you know exists. Replace /ABSOLUTE/PATH/TO/YOUR/PROJECT with your own real project directory — do not copy it literally:
npx @zerople/hermes-shim-http \
--host 127.0.0.1 \
--port 8765 \
--command claude \
--cwd /ABSOLUTE/PATH/TO/YOUR/PROJECT \
--model opusIf you prefer Codex or OpenCode, the same rule applies — replace "$(pwd)" or the absolute path with your own project location:
npx @zerople/hermes-shim-http --command codex --cwd "$(pwd)" --model codex-cli
npx @zerople/hermes-shim-http --command opencode --cwd "$(pwd)" --model opencode-cli2) Run a preflight check if you want a quick sanity check
npx @zerople/hermes-shim-http --doctor --command claude --cwd "$(pwd)"This prints a short preflight summary covering:
- detected Python interpreter
- whether Python virtual environments are supported
- whether your
--cwdexists - whether the wrapped CLI command is available
- which CLI args are effectively being used after auto defaults
3) Confirm the server is up
curl http://127.0.0.1:8765/health
curl http://127.0.0.1:8765/v1/models
curl http://127.0.0.1:8765/v1/debug/stats
curl http://127.0.0.1:8765/v1/debug/quota4) Send a test request
curl http://127.0.0.1:8765/v1/chat/completions \
-H 'Content-Type: application/json' \
-d '{
"model": "opus",
"messages": [
{"role": "user", "content": "Say hello in one short sentence."}
]
}'Observability and control flags
Useful runtime flags (current v0.1.30):
npx @zerople/hermes-shim-http \
--cache-path ~/.cache/hermes-shim-http/sessions.sqlite \
--cache-ttl-seconds 3600 \
--cache-max-entries 256 \
--compaction window \
--compaction-threshold 0.9 \
--log-level info \
--log-format textNotes:
Persistent SQLite session caching is opt-in; omit
--cache-pathto keep cache state in memory only.GET /v1/debug/statsexposes cache, latency, uptime, and token/context aggregates.GET /v1/debug/quotareturns{ "status": "unknown" }until the wrapped CLI exposes real quota data.chat/responses usage objects include fallback
context_tokens_used,context_tokens_limit, andresponse_tokensestimates.slash commands
/clear,/compact,/model <name>, and/statsare handled as normal assistant responses.long SSE streams emit
: pingcomments during idle periods to avoid proxy/client timeouts.chat/responses runtime now uses request-scoped tool-call nonces (
<tool_call nonce="...">) and only executes matching-nonce tool-call blocks from CLI output.
Debug / parsing observability env vars:
HERMES_SHIM_CLAUDE_RAW_LOG_DIR- default:
~/.hermes/hermes-shim-http/raw-logs/(enabled even when unset) - set to a custom path to override
- set to empty (
HERMES_SHIM_CLAUDE_RAW_LOG_DIR=) to disable raw log capture
- default:
HERMES_SHIM_JSON_REPAIR_ENABLED=1- opt-in malformed
<tool_call>JSON repair usingjson_repair - default is off; when repair succeeds, the shim still emits a malformed-block notice for observability
- opt-in malformed
What's new in 0.1.31
0.1.31 is a release-hygiene + documentation alignment cut:
- npm publish workflow is now trusted-publishing only (
npm publish --provenance --access publicwith GitHub OIDC) - maintainer release docs and roadmap were refreshed to match current behavior
- README operator guidance/CLI options were cleaned up to remove stale
0.1.7references
For runtime protocol hardening changes, see 0.1.30 in CHANGELOG.md.
Recommended usage with Hermes Agent
Start the shim (use your real project path, not the example below):
cd /ABSOLUTE/PATH/TO/YOUR/PROJECT # replace with your real project path
npx @zerople/hermes-shim-http \
--host 127.0.0.1 \
--port 8765 \
--command claude \
--cwd "$(pwd)" \
--model opusThen point Hermes to the local provider.
If you are currently using openai-codex
A lot of users already have a config.yaml that looks something like this:
model:
base_url: https://chatgpt.com/backend-api/codex
context_length: 1000000
default: gpt-5.4
provider: openai-codex
api_key: no-key-required
providers: {}
fallback_providers: []You do not need to delete that from memory and figure everything out again. The easiest way is:
- keep a copy of your old config
- replace the active
model:block with one of the shim examples below - add a
custom_providers:section
If you want, you can even keep your old provider as a commented backup in the same file.
Option A: use the shim via chat_completions
This is usually the easiest place to start.
model:
default: opus
provider: custom
base_url: http://127.0.0.1:8765/v1
api_key: no-key-required
api_mode: chat_completions
context_length: 1000000
providers: {}
custom_providers:
- name: claude-shim
base_url: http://127.0.0.1:8765/v1
api_key: no-key-required
api_mode: chat_completions
model: opus
fallback_providers: []Option B: use the shim via codex_responses
Use this if you specifically want Hermes to talk to the shim through /v1/responses.
model:
default: opus
provider: custom
base_url: http://127.0.0.1:8765/v1
api_key: no-key-required
api_mode: codex_responses
context_length: 1000000
providers: {}
custom_providers:
- name: claude-shim-responses
base_url: http://127.0.0.1:8765/v1
api_key: no-key-required
api_mode: codex_responses
model: opus
fallback_providers: []Example with the old config kept as a backup
This can make editing less stressful for users who want a clear before/after reference:
# old setup
# model:
# base_url: https://chatgpt.com/backend-api/codex
# context_length: 1000000
# default: gpt-5.4
# provider: openai-codex
# api_key: no-key-required
model:
default: opus
provider: custom
base_url: http://127.0.0.1:8765/v1
api_key: no-key-required
api_mode: codex_responses
context_length: 1000000
providers: {}
custom_providers:
- name: claude-shim-responses
base_url: http://127.0.0.1:8765/v1
api_key: no-key-required
api_mode: codex_responses
model: opus
fallback_providers: []Which one should you use?
- use
chat_completionsif you want the simplest, most familiar setup - use
codex_responsesif you want Hermes to call the shim's/v1/responsesendpoint
A practical rule of thumb:
- start with
chat_completions - switch to
codex_responsesonly if you specifically want the Responses-style flow
CLI options
For the full, up-to-date list always check:
npx @zerople/hermes-shim-http --helpKey options:
--host HOST
--port PORT
--command COMMAND
--cwd CWD
--timeout TIMEOUT
--model MODELS
--fallback-model FALLBACK_MODEL
--profile {auto,claude,codex,opencode,generic}
--cache-path CACHE_PATH
--cache-ttl-seconds CACHE_TTL_SECONDS
--cache-max-entries CACHE_MAX_ENTRIES
--compaction {off,summarize,window}
--compaction-threshold COMPACTION_THRESHOLD
--log-level {info,debug}
--log-format {text,json}
--heartbeat-wrap | --no-heartbeat-wrap
--heartbeat-interval HEARTBEAT_INTERVAL
--single-child-lock | --no-single-child-lock
--single-child-lock-path SINGLE_CHILD_LOCK_PATH
--http-heartbeat-interval HTTP_HEARTBEAT_INTERVAL
--strict-mcp-config | --no-strict-mcp-config
--live-child-pool | --no-live-child-pool
--live-child-pool-size LIVE_CHILD_POOL_SIZE
--live-child-pool-idle-ttl LIVE_CHILD_POOL_IDLE_TTLCommon options
--host
Bind address. Default is usually127.0.0.1.--port
HTTP port to listen on.--command
Which local CLI executable to run, for exampleclaude,codex, oropencode.--cwd
Working directory passed to the wrapped CLI. Use your project root here.--timeout
Maximum runtime for a single CLI request.--model
Model name reported back to API clients. For Claude-backed usage, the built-in model list issonnet,opus, andhaiku, so using one of those names is recommended.--profile
Command profile.autousually picks a sensible default based on--command. For Claude with no explicit-- ...CLI args, the shim defaults to-p --dangerously-skip-permissionsso the launcher works out of the box for Hermes-style tool-selection flows.
Supported API behavior
Main endpoints
GET /healthGET /v1/modelsPOST /v1/chat/completionsPOST /v1/responses
Streaming
Supports streaming responses for both:
- Chat Completions API
- Responses API
Tool calls
Tool calls are supported through shim parsing of CLI output blocks like:
<tool_call>{...}</tool_call>Only tools advertised by the client request are forwarded as callable tool payloads.
Compatibility probe endpoints
To avoid noisy probe failures from clients, these compatibility routes also return benign responses:
GET /api/v1/modelsGET /api/tagsGET /v1/propsGET /propsGET /version
How first-run bootstrap works
When you run:
npx @zerople/hermes-shim-http ...the Node launcher will:
- locate
python3(or another usable Python 3 executable) - create a cached virtualenv in
~/.cache/hermes-shim-http/ - install pinned Python dependencies from
requirements.txt - launch
python -m hermes_shim_http.server
This means most users do not need to manually create a venv to use the package.
Troubleshooting
Python 3 is required but was not found in PATH
Install Python 3 and make sure one of these works in your shell:
python3 --version
# or
python --versionPython virtual environment support is missing / ensurepip is not available
The launcher needs python3 -m venv to work on first run. If your system Python was installed without venv support, the launcher now reports a friendlier error and suggests the package to install.
On Debian/Ubuntu/WSL, try:
sudo apt update
sudo apt install python3-venvIf your distro uses versioned packages, install the one that matches your Python version instead, for example:
sudo apt install python3.12-venv
# or
sudo apt install python3.11-venvYou can also run the built-in preflight check:
npx @zerople/hermes-shim-http --doctor --command claude --cwd "$(pwd)"My CLI command is not found
Make sure the command you pass in --command is already installed and available in your shell:
claude --help
codex --help
opencode --helpFileNotFoundError: [Errno 2] No such file or directory: '/path/to/project'
This means you passed a --cwd value that does not exist on your machine. The /path/to/project string in the docs is only a placeholder — you must replace it with a real, existing directory.
Fix it by either:
cdinto your real project and use--cwd "$(pwd)", or- pass an absolute path you know exists, e.g.
--cwd /ABSOLUTE/PATH/TO/YOUR/PROJECT(replace with your actual directory)
You can confirm the path exists before launching:
ls -ld /ABSOLUTE/PATH/TO/YOUR/PROJECTThe shim starts but the CLI behaves oddly
Check the --cwd value first. Many tool-use and file-access issues happen because the wrapped CLI is running in the wrong working directory.
Note that --cwd only sets the working directory for the wrapped CLI subprocess. It does not change the working directory of the client that consumes the shim's responses. If the model emits tool calls with relative paths, the consuming client may resolve them against a different directory. Prefer absolute paths in tool-call arguments when possible.
First run is slow
That is expected. The launcher creates a cached Python environment and installs dependencies on the first run. Later runs should be faster.
If first-run bootstrap fails, run:
npx @zerople/hermes-shim-http --doctor --command claude --cwd "$(pwd)"before retrying the normal launch command.
Development
Clone the repository and run tests locally:
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements-dev.txt
npm testUseful commands:
node bin/hermes-shim-http.js --help
node bin/hermes-shim-http.js --doctor --command claude --cwd "$(pwd)"
npm packAdditional documentation
- Maintainers:
docs/maintainers/releasing.md - Architecture/background:
docs/architecture/hermes-integration.md
License
MIT
