@miloveme/claude-code-api
v0.1.7
OpenAI API-compatible MCP channel server for Claude Code
claude-code-api
Turn your Claude Code session into an OpenAI-compatible API server.
Any client that speaks the OpenAI chat completions format — Cursor, Continue, Open WebUI, custom scripts, curl — can point at this server and interact with your running Claude Code session. Supports both standard and SSE streaming responses.
Prerequisites
- Node.js >= 18
Quick Setup
1. Install the MCP server.
These are Claude Code commands — run claude to start a session first.
Add to your Claude Code config (~/.claude.json):
```json
{
  "mcpServers": {
    "openai-compat": {
      "command": "npx",
      "args": ["-y", "@miloveme/claude-code-api@latest"]
    }
  }
}
```

Or install globally and point at the binary:

```shell
npm install -g @miloveme/claude-code-api
```

Note: The MCP server key name (e.g. "openai-compat") can be anything you want. Just make sure the `server:name` in the launch command matches. For example, if you name it "my-api", launch with `server:my-api`.
2. Launch Claude Code with the channel.
```shell
# Basic — Claude will ask for permission on each API request
claude --dangerously-load-development-channels server:openai-compat

# Recommended — skip permission prompts for uninterrupted API usage
claude --dangerously-load-development-channels server:openai-compat --dangerously-skip-permissions
```

Why `--dangerously-skip-permissions`? Without this flag, Claude Code prompts for tool approval every time an API request arrives (e.g. "Do you want to allow the reply tool?"). This blocks the HTTP response until you manually approve it in the terminal. With the flag, all tool calls are auto-approved, so API clients get responses without manual intervention. Only use this on a trusted local machine.
3. Send a request.
```shell
# Non-streaming
curl http://localhost:8787/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model":"claude-code","messages":[{"role":"user","content":"Hello!"}]}'

# Streaming (SSE)
curl -N http://localhost:8787/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model":"claude-code","messages":[{"role":"user","content":"Hello!"}],"stream":true}'
```

Configuration
All settings are via environment variables. Set them in the env block of your MCP config:
```json
{
  "mcpServers": {
    "openai-compat": {
      "command": "npx",
      "args": ["-y", "@miloveme/claude-code-api@latest"],
      "env": {
        "OPENAI_COMPAT_PORT": "8787",
        "OPENAI_COMPAT_API_KEY": "my-secret-key",
        "OPENAI_COMPAT_MODEL": "claude-code",
        "OPENAI_COMPAT_TIMEOUT": "600000"
      }
    }
  }
}
```

| Variable | Default | Description |
| --- | --- | --- |
| OPENAI_COMPAT_PORT | 8787 | HTTP server port |
| OPENAI_COMPAT_API_KEY | (none) | Bearer token for authentication. When set, all API requests must include Authorization: Bearer <key>. |
| OPENAI_COMPAT_MODEL | claude-code | Model name returned in /v1/models and completion responses |
| OPENAI_COMPAT_TIMEOUT | 600000 | Request timeout in milliseconds (default 10 minutes) |
| OPENAI_COMPAT_LOG | ~/.claude/channels/openai-compat/requests.log | Log file path |
| OPENAI_COMPAT_LOG_LEVEL | error | Log verbosity: none, error, info, debug |
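The bearer-token check implied by `OPENAI_COMPAT_API_KEY` can be sketched as follows. This is a simplification for illustration, not the server's actual code:

```python
def authorized(headers, api_key):
    """If an API key is configured, require a matching Bearer token.

    headers: dict of HTTP request headers.
    api_key: value of OPENAI_COMPAT_API_KEY, or None if unset.
    """
    if not api_key:
        # No key configured: all requests pass.
        return True
    return headers.get("Authorization") == f"Bearer {api_key}"
```

With a key set, a request without the matching `Authorization: Bearer <key>` header is rejected; with no key set, every request is accepted.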
API Endpoints
| Method | Path | Description |
| --- | --- | --- |
| GET | / or /health | Health check — no auth required |
| GET | /v1/models | List available models |
| POST | /v1/chat/completions | Chat completions (OpenAI-compatible) |
POST /v1/chat/completions
Accepts the standard OpenAI request body:
```json
{
  "model": "claude-code",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "What is 2+2?" }
  ],
  "stream": false
}
```

- `stream: false` (default) — Returns a single JSON response when Claude finishes.
- `stream: true` — Returns Server-Sent Events (SSE) with `data:` chunks in real time, terminated by `data: [DONE]`.
Tools exposed to the assistant
| Tool | Purpose |
| --- | --- |
| reply | Send a response back to the API client. Takes request_id (from the inbound channel notification), text (response content), and optional done (boolean). For non-streaming: call once with done=true. For streaming: call multiple times with done=false, then a final call with done=true. |
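The two call patterns can be illustrated with a toy stand-in for the `reply` tool. The buffering logic and request IDs here are invented for the example; the real tool delivers text to the HTTP client rather than a dict:

```python
buffers = {}    # request_id -> accumulated text chunks
completed = {}  # request_id -> final assembled response

def reply(request_id, text, done=False):
    """Toy reply tool: buffer chunks, finalize when done=True."""
    buffers.setdefault(request_id, []).append(text)
    if done:
        completed[request_id] = "".join(buffers.pop(request_id))

# Non-streaming: a single call with done=True
reply("req_1", "4", done=True)

# Streaming: partial chunks, then a final call with done=True
reply("req_2", "The answer ", done=False)
reply("req_2", "is 4.", done=True)
```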
Client Examples
Cursor / Continue
Point the OpenAI-compatible provider at:
http://localhost:8787/v1

Set the API key if configured, and use `claude-code` as the model name.
Python (openai SDK)
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8787/v1",
    api_key="my-secret-key",  # or "no-key" if auth is disabled
)

response = client.chat.completions.create(
    model="claude-code",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)

for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```

curl
```shell
curl -N http://localhost:8787/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer my-secret-key' \
  -d '{"model":"claude-code","messages":[{"role":"user","content":"Hello!"}],"stream":true}'
```

How it works
This is an MCP server that declares the claude/channel capability. When Claude Code launches with the channel flag, the server:
- Connects to Claude Code over stdio (MCP protocol)
- Starts an HTTP server on the configured port
- Receives OpenAI-format requests from clients
- Forwards them as `notifications/claude/channel` events to Claude Code
- Claude processes the message and calls the `reply` tool
- The reply is sent back to the HTTP client as an OpenAI-format response
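The forwarding step can be sketched roughly as below. The payload shape here is hypothetical; the real notification schema belongs to the MCP channel protocol and is not documented in this README:

```python
def to_channel_notification(request_id, body):
    """Map an OpenAI-format request body onto a channel notification (sketch)."""
    user_turns = [m["content"] for m in body["messages"] if m["role"] == "user"]
    return {
        "method": "notifications/claude/channel",
        "params": {
            # The request_id is what the assistant later passes to reply().
            "request_id": request_id,
            "message": user_turns[-1],
        },
    }
```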
```
┌──────────┐    HTTP     ┌─────────────────┐  MCP/stdio   ┌─────────────┐
│  Client  │ ──────────► │ claude-code-api │ ◄──────────► │ Claude Code │
│ (Cursor) │ ◄────────── │      :8787      │              │             │
└──────────┘  JSON/SSE   └─────────────────┘              └─────────────┘
```

Port conflicts
If the port is already in use by a previous session, the server automatically kills the zombie process and reclaims the port. No manual intervention needed.
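You can check from the outside whether anything is listening on the port with a simple socket probe. A sketch, not part of the package:

```python
import socket

def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        # connect_ex returns 0 on a successful TCP connection.
        return s.connect_ex((host, port)) == 0
```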
Logging
All requests and responses are logged to:
- stderr — visible in Claude Code's MCP server logs
- file — `~/.claude/channels/openai-compat/requests.log` (configurable via `OPENAI_COMPAT_LOG`)
Log levels control verbosity (OPENAI_COMPAT_LOG_LEVEL):
| Level | What gets logged |
| --- | --- |
| none | Nothing |
| error | Timeouts and errors only (default) |
| info | Requests + responses + errors |
| debug | Everything including stream cancellations |
```shell
# Monitor logs in real time
tail -f ~/.claude/channels/openai-compat/requests.log
```

License
MIT
