@readtt/claude-max-api-proxy

v1.5.0

Published

12 days ago

Use your Claude Max subscription with any OpenAI-compatible client. Wraps Claude Code CLI as an OpenAI-compatible API server.

Downloads

167

readtt

claude anthropic claude-code claude-max openai-compatible api-server oauth clawdbot llm ai

Claude Max API Proxy

Use your Claude Max subscription as an OpenAI-compatible API. Any OpenAI client (Continue.dev, Cursor, OpenClaw, the OpenAI SDKs, curl) talks to this proxy on localhost, and the proxy runs your prompts through the Claude Code CLI you're already paying for — no per-token API bill.

Subject to Anthropic's fair use policy. This wraps the official claude CLI; it does not extract tokens or bypass auth.

Requirements

A Claude Max subscription
Node.js 20+

Claude Code CLI, installed and logged in:

npm install -g @anthropic-ai/claude-code
claude          # log in once, interactively

Run it

Install from npm (quickest):

npm install -g @readtt/claude-max-api-proxy
claude-max-api            # starts on http://localhost:3456

To use a different port: claude-max-api 8080.

Or run from source:

git clone https://github.com/Readtt/claude-max-api-proxy.git
cd claude-max-api-proxy
npm install
npm run serve     # builds, then starts on http://localhost:3456

Either way, quick check:

curl http://localhost:3456/v1/models

Use it

Point any OpenAI client at http://localhost:3456/v1.

curl -X POST http://localhost:3456/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"claude-opus-4-8","messages":[{"role":"user","content":"Hello!"}]}'

Add "stream": true for SSE token streaming.
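When streaming is on, an OpenAI-compatible server is expected to send `data: {...}` lines and finish with `data: [DONE]`. The sketch below shows how a client might stitch the text back together using only the standard library. The function name `iter_content_deltas` and the sample lines are illustrative assumptions. The sample is hand-written in the OpenAI chunk format, not captured from this proxy.

```python
import json
from typing import Iterable, Iterator


def iter_content_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style chat-completion SSE lines."""
    for raw in lines:
        line = raw.strip()
        # SSE payload lines start with "data:"; skip blanks and other fields.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # OpenAI's end-of-stream sentinel
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Hand-written sample in the OpenAI chunk format (an assumption, not proxy output).
sample = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant"}}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"lo!"}}]}',
    "data: [DONE]",
]
print("".join(iter_content_deltas(sample)))  # prints "Hello!"
```

Against a running proxy, the same function could be fed the decoded lines of the HTTP response body instead of `sample`.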

Python (OpenAI SDK):

from openai import OpenAI
client = OpenAI(base_url="http://localhost:3456/v1", api_key="not-needed")
client.chat.completions.create(
    model="claude-opus-4-8",
    messages=[{"role": "user", "content": "Hello!"}],
)

Continue.dev / Cursor: add a model with provider: openai, apiBase: http://localhost:3456/v1, any apiKey.

Models

Two ways to choose a model, so you never have to update the proxy for new releases:

Latest in a family — use a bare alias: opus, sonnet, or haiku (also matched in any name, e.g. claude-opus-4 → latest Opus).
Pin a specific version — use the full ID and it's passed straight to the CLI: claude-opus-4-7, claude-sonnet-4-5-20250929, etc. Availability depends on your subscription.

Provider prefixes are fine too: anthropic/..., claude-max/..., claude-code-cli/.... Unknown names default to the latest Opus.
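The naming rules above can be pictured with a small resolver. This is only a sketch of the described behavior, not the proxy's actual code. The function `resolve_model` and the `PINNED_ID` pattern are hypothetical. In particular, the README does not say how the proxy tells a pinned ID from a family match, so the sketch assumes a pinned ID carries a major and minor version (optionally followed by an 8-digit date).

```python
import re

FAMILIES = ("opus", "sonnet", "haiku")
PROVIDER_PREFIXES = ("anthropic/", "claude-max/", "claude-code-cli/")
# Assumption: "pinned" means claude-<family>-<major>-<minor>[-<yyyymmdd>].
PINNED_ID = re.compile(r"^claude-(?:opus|sonnet|haiku)-\d+-\d+(?:-\d{8})?$")


def resolve_model(name: str) -> str:
    """Map a client-supplied model name to what would be handed to the CLI."""
    model = name.strip().lower()
    # Drop one known provider prefix, if present.
    for prefix in PROVIDER_PREFIXES:
        if model.startswith(prefix):
            model = model[len(prefix):]
            break
    # Full IDs are passed through unchanged.
    if PINNED_ID.match(model):
        return model
    # Otherwise any name mentioning a family resolves to that family's alias.
    for family in FAMILIES:
        if family in model:
            return family
    # Unknown names fall back to the latest Opus.
    return "opus"


for example in ("sonnet", "claude-opus-4", "anthropic/claude-sonnet-4-5-20250929", "gpt-4o"):
    print(example, "->", resolve_model(example))
```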

GET /v1/models lists the three family aliases (always the latest), so it never goes stale. To also advertise specific pinned IDs (e.g. for a UI model picker), set CLAUDE_PROXY_MODELS=claude-opus-4-8,claude-sonnet-4-6.

OpenAI compatibility

Use it as a drop-in OpenAI endpoint: chat, streaming, function/tool calling, JSON mode (response_format), image input/vision (image_url), and reasoning_effort all work. Sampling params like temperature and max_tokens are accepted but ignored (the CLI can't honor them), and embeddings/image-generation/audio aren't available. Full matrix: COMPATIBILITY.md.
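As an illustration of the features listed above, here is a minimal sketch that builds (but does not send) a chat request with JSON mode and `reasoning_effort` set. The helper `build_chat_request` is hypothetical. The field shapes follow the OpenAI Chat Completions API, and the value `"low"` is assumed to be accepted by the proxy because it is valid for OpenAI.

```python
import json
import urllib.request

BASE_URL = "http://localhost:3456/v1"  # default address from this README


def build_chat_request(prompt: str, model: str = "sonnet") -> urllib.request.Request:
    """Build a POST request for /chat/completions without sending it."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "response_format": {"type": "json_object"},  # JSON mode
        "reasoning_effort": "low",  # assumed value, taken from the OpenAI API
        "temperature": 0.2,  # accepted but ignored, per the text above
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_chat_request("Reply with a JSON object containing a greeting.")
print(req.get_method(), req.full_url)

# To send it against a running proxy:
#   with urllib.request.urlopen(req) as resp:
#       print(json.load(resp)["choices"][0]["message"]["content"])
```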

Endpoints

| Endpoint | Description |
|----------|-------------|
| POST /v1/chat/completions | Chat (streaming + non-streaming) |
| GET /v1/models | List models |
| GET /v1/models/{id} | Retrieve a single model |
| GET /v1/usage, GET /v1/usage/recent | Token usage + estimated savings |
| GET /health | Health check |

Optional: API key auth

For shared/team use, require a Bearer token:

API_KEYS=sk-team-abc,sk-team-def npm run serve

Clients then send Authorization: Bearer sk-team-abc. If API_KEYS is unset, the proxy requires no authentication.
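To make the Bearer scheme concrete, here is a rough sketch of how a comma-separated key list and an `Authorization` header could be checked. It mirrors the behavior described above and is not the proxy's implementation. `parse_api_keys` and `is_authorized` are hypothetical names.

```python
import hmac
from typing import List, Optional


def parse_api_keys(env_value: Optional[str]) -> List[str]:
    """Split a comma-separated API_KEYS value into a clean list."""
    if not env_value:
        return []
    return [key.strip() for key in env_value.split(",") if key.strip()]


def is_authorized(authorization_header: Optional[str], keys: List[str]) -> bool:
    """Return True if the header carries one of the configured Bearer tokens."""
    if not keys:
        return True  # no keys configured means auth is off
    if not authorization_header:
        return False
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    token = token.strip()
    # compare_digest avoids leaking match position through timing.
    return any(hmac.compare_digest(token.encode(), key.encode()) for key in keys)


keys = parse_api_keys("sk-team-abc,sk-team-def")
print(is_authorized("Bearer sk-team-abc", keys))  # True
print(is_authorized("Bearer wrong-key", keys))    # False
```

On the client side, the OpenAI SDKs send this header automatically when `api_key` is set to one of the configured tokens.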

Configuration

All optional, set as environment variables:

| Variable | Default | Purpose |
|----------|---------|---------|
| API_KEYS | (unset) | Comma-separated Bearer tokens to require (see above). |
| CLAUDE_PROXY_MODELS | (unset) | Extra pinned model IDs to list in /v1/models. |
| SYSTEM_PROMPT_MODE | replace | replace = your system prompt fully defines the persona (neutral, OpenAI-like). append = add it on top of Claude Code's default prompt. |
| LOG_LEVEL | info | error, warn, info, or debug. info logs each request/response (model, duration, tokens); debug adds per-chunk subprocess detail. |
| DEBUG | (unset) | Legacy alias: any value forces LOG_LEVEL=debug. |

Notes

Prompts go to the CLI via stdin, and the system prompt via a temp file. Neither touches the command line, so large requests (e.g. code-review diffs plus tool definitions) don't hit the OS argument-length limit (about 32K characters on Windows; an E2BIG error elsewhere).
See ARCHITECTURE.md for how it works and PROTOCOL.md for the request/response mapping.
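The stdin-plus-temp-file approach in the first note can be tried out in isolation. The sketch below uses a stand-in child process (the Python interpreter itself, not the claude CLI) so that it runs anywhere. `run_with_large_input` and `CHILD` are hypothetical names, and no claude CLI flags are implied.

```python
import os
import subprocess
import sys
import tempfile

# Stand-in child (NOT the claude CLI): reads the prompt from stdin and the
# system prompt from the file named in argv[1], then prints both lengths.
CHILD = (
    "import sys; "
    "print(len(sys.stdin.read()), len(open(sys.argv[1], encoding='utf-8').read()))"
)


def run_with_large_input(prompt: str, system_prompt: str) -> str:
    """Pass bulky data via stdin and a temp file, keeping argv tiny."""
    with tempfile.NamedTemporaryFile(
        "w", suffix=".txt", delete=False, encoding="utf-8"
    ) as handle:
        handle.write(system_prompt)
        path = handle.name
    try:
        result = subprocess.run(
            [sys.executable, "-c", CHILD, path],  # only the short path is in argv
            input=prompt,
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()
    finally:
        os.unlink(path)  # always clean up the temp file


# 200,000 characters would be far too long for a Windows command line.
print(run_with_large_input("x" * 200_000, "y" * 50_000))  # prints "200000 50000"
```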

License

MIT