# pi-openai-proxy

`@victor-software-house/pi-openai-proxy` (v4.9.2)
A local OpenAI-compatible HTTP proxy built on pi's SDK. Routes requests through pi's multi-provider model registry and credential management, exposing a single http://localhost:4141/v1/... endpoint that any OpenAI-compatible client can connect to.
## Why

- Single gateway to 20+ LLM providers (Anthropic, OpenAI, Google, Bedrock, Mistral, xAI, Groq, OpenRouter, Vertex, etc.) via one OpenAI-compatible API
- No duplicate config — reuses pi's `~/.pi/agent/auth.json` and `models.json` for credentials and model definitions
- Self-hosted — runs locally, no third-party proxy services
- Streaming — full SSE streaming with token usage and cost tracking
- Strict validation — unsupported parameters are rejected clearly, not silently ignored
## Prerequisites

- pi must be installed
- At least one provider must be configured via `pi /login`
- Bun (for development) or Node.js >= 20 (for production)
## Installation

```bash
# Install globally
npm install -g @victor-software-house/pi-openai-proxy

# Or run directly with npx
npx @victor-software-house/pi-openai-proxy
```

## Quickstart

Start the proxy (defaults to http://127.0.0.1:4141):

```bash
pi-openai-proxy
```

List available models:

```bash
curl http://localhost:4141/v1/models | jq '.data[].id'
```

Chat completion (non-streaming):
```bash
curl http://localhost:4141/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4-20250514",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

Chat completion (streaming):
```bash
curl http://localhost:4141/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Tell me a joke"}],
    "stream": true
  }'
```

## Use with any OpenAI-compatible client
Point any client that supports `OPENAI_API_BASE` (or equivalent) at `http://localhost:4141/v1`:

```bash
# Example: Aider
OPENAI_API_BASE=http://localhost:4141/v1 aider --model anthropic/claude-sonnet-4-20250514

# Example: Continue (in settings.json)
# "apiBase": "http://localhost:4141/v1"

# Example: Open WebUI
# Set "OpenAI API Base URL" to http://localhost:4141/v1
```

## Model resolution
The proxy resolves model references in this order:

1. Exact public ID match — the ID from `GET /v1/models`
2. Canonical ID fallback — the `provider/model-id` format (only for exposed models)
With the default collision-prefixed mode and no collisions, model IDs are exposed without prefixes:

```bash
curl http://localhost:4141/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hi"}]}'
```

## Supported Endpoints
| Endpoint | Description |
|---|---|
| `GET /v1/models` | List exposed models (filtered by exposure mode, only those with configured credentials) |
| `GET /v1/models/{model}` | Model details by public ID or canonical ID (supports URL-encoded IDs with `/`) |
| `POST /v1/chat/completions` | Chat completions (streaming and non-streaming) |
## Supported Chat Completions Features
| Feature | Notes |
|---|---|
| model | Public ID, canonical (provider/model-id), or collision-prefixed shorthand |
| messages (text) | system, developer, user, assistant, tool roles |
| messages (base64 images) | Base64 data URI image content parts (image/png, image/jpeg, image/gif, image/webp) |
| stream | SSE with text_delta and toolcall_delta mapping |
| temperature | Direct passthrough |
| max_completion_tokens | Preferred; max_tokens accepted as deprecated fallback |
| user | Via passthrough |
| stream_options.include_usage | Final usage chunk in SSE stream |
| tools / tool_choice | JSON Schema -> TypeBox conversion (supported subset); tool_choice translated per-provider |
| tool_calls in messages | Assistant tool call + tool result roundtrip |
| parallel_tool_calls | Forwarded to OpenAI/Codex; translated to disable_parallel_tool_use for Anthropic |
| reasoning_effort | Maps to pi's ThinkingLevel (none, minimal, low, medium, high, xhigh) |
| response_format | text, json_object, and json_schema (OpenAI-compatible APIs only) |
| top_p | Supported by OpenAI, Anthropic (native), and Google (translated to topP) |
| frequency_penalty | OpenAI and Google only (translated to frequencyPenalty) |
| presence_penalty | OpenAI and Google only (translated to presencePenalty) |
| seed | OpenAI and Google only (translated to nested generationConfig.seed) |
| stop | Translated per-provider (stop_sequences for Anthropic, stopSequences for Google) |
| metadata | OpenAI passthrough; arbitrary keys silently skipped for other providers |
| prediction | OpenAI passthrough only (speculative decoding) |
Not supported: `n > 1`, `logprobs`, `logit_bias`, remote image URLs (disabled by default).
## Provider-aware field translation

Fields are translated to each provider's native format, not blindly forwarded:

- OpenAI / Mistral: all fields passed directly (same wire format)
- Codex (Responses API): only `tool_choice` and `parallel_tool_calls` (other fields rejected by the API)
- Anthropic: `tool_choice` → `{ type }` format, `stop` → `stop_sequences`, `user` → `metadata.user_id`, `parallel_tool_calls: false` → `disable_parallel_tool_use`
- Google: nested into `generationConfig` with camelCase names (`topP`, `stopSequences`, `seed`, etc.)
Fields that have no equivalent in a target provider are silently skipped — the request succeeds and the provider's default applies.
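As a sketch of these mappings (simplified and incomplete; the helper name and data shapes are illustrative, not the proxy's actual code):

```python
def translate_fields(params: dict, provider: str) -> dict:
    """Translate OpenAI-style sampling fields to a provider's native names.

    Fields with no equivalent are silently dropped, so the provider's
    defaults apply, mirroring the behavior described above.
    """
    if provider in ("openai", "mistral"):
        return dict(params)  # same wire format, pass through as-is
    out: dict = {}
    if provider == "anthropic":
        renames = {"temperature": "temperature", "top_p": "top_p",
                   "stop": "stop_sequences"}
        for k, v in params.items():
            if k == "user":
                out.setdefault("metadata", {})["user_id"] = v
            elif k == "parallel_tool_calls" and v is False:
                out["disable_parallel_tool_use"] = True
            elif k in renames:
                out[renames[k]] = v
    elif provider == "google":
        renames = {"temperature": "temperature", "top_p": "topP",
                   "stop": "stopSequences", "seed": "seed",
                   "frequency_penalty": "frequencyPenalty",
                   "presence_penalty": "presencePenalty"}
        # Google expects these nested under generationConfig, camelCased
        out["generationConfig"] = {renames[k]: v for k, v in params.items()
                                   if k in renames}
    return out
```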
## Model Naming and Exposure

### Public model IDs
The proxy generates public model IDs based on a configurable ID mode:
| Mode | Behavior | Example |
|---|---|---|
| collision-prefixed (default) | Raw model IDs; prefix only providers that share a model name | gpt-4o or openai/gpt-4o if codex also has gpt-4o |
| universal | Raw model IDs only; rejects config if duplicates exist | gpt-4o, claude-sonnet-4-20250514 |
| always-prefixed | Always <prefix>/<model-id> | openai/gpt-4o, anthropic/claude-sonnet-4-20250514 |
The collision-prefixed mode prefixes all models from providers that form a connected conflict group (not just the colliding model names).
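To illustrate what "connected conflict group" means, here is a hypothetical sketch of that grouping rule using union-find (the catalog shape is an assumption for illustration, not the proxy's internals):

```python
def prefixed_providers(catalog: dict[str, set[str]]) -> set[str]:
    """Return provider keys whose models get prefixed in collision-prefixed mode.

    `catalog` maps provider -> set of raw model IDs. Providers that share
    any model name are linked into one conflict group, and every provider
    in that group is prefixed, not just the directly colliding pair.
    """
    providers = list(catalog)
    parent = {p: p for p in providers}

    def find(p: str) -> str:
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path compression
            p = parent[p]
        return p

    conflicted = set()
    for i, a in enumerate(providers):
        for b in providers[i + 1:]:
            if catalog[a] & catalog[b]:  # at least one shared model ID
                parent[find(a)] = find(b)
                conflicted.update({a, b})

    # Prefix everyone connected to a conflicted provider
    roots = {find(p) for p in conflicted}
    return {p for p in providers if find(p) in roots}
```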
### Exposure modes
Control which models appear in the API:
| Mode | Behavior |
|---|---|
| all (default) | Expose every model with configured credentials |
| scoped | Expose models from selected providers only (scopedProviders) |
| custom | Expose an explicit allowlist of canonical IDs (customModels) |
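A sketch of the filtering step (the function shape is illustrative; the credential check mentioned for `all` mode is assumed to happen elsewhere):

```python
def exposed_models(all_models: list[str], mode: str,
                   scoped_providers: list[str] = (),
                   custom_models: list[str] = ()) -> list[str]:
    """Filter canonical IDs (provider/model-id) by exposure mode."""
    if mode == "all":
        return list(all_models)
    if mode == "scoped":
        # Keep models whose provider key is in scopedProviders
        return [m for m in all_models
                if m.split("/", 1)[0] in scoped_providers]
    if mode == "custom":
        # Explicit allowlist of canonical IDs (customModels)
        return [m for m in all_models if m in custom_models]
    raise ValueError(f"unknown exposure mode: {mode}")
```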
### Canonical model IDs
Internal canonical IDs use the `provider/model-id` format matching pi's registry:

```
anthropic/claude-sonnet-4-20250514
openai/gpt-4o
google/gemini-2.5-pro
xai/grok-3
openrouter/anthropic/claude-sonnet-4-20250514
```

Canonical IDs are accepted as a backward-compatible fallback in requests, but only for models that are currently exposed. Hidden models cannot be reached by canonical ID.
## Configuration

### What comes from pi
The proxy reads two files from pi's configuration directory (~/.pi/agent/):
| File | Managed by | What the proxy uses |
|---|---|---|
| auth.json | pi /login | API keys for each provider (Anthropic, OpenAI, Google, etc.) |
| models.json | pi built-in + user edits | Model definitions, capabilities, and pricing |
The proxy does not read pi's settings.json (installed packages, enabled extensions) or session-level model filters (--models flag). All models with configured credentials are exposed through the proxy, regardless of pi session scope.
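For orientation, reading the credential file might look like the sketch below. The top-level layout of `auth.json` (provider key mapping to a credential entry) is an assumption for illustration; consult pi's own documentation for the real schema.

```python
import json
from pathlib import Path

def configured_providers(agent_dir: str = "~/.pi/agent") -> set[str]:
    """Return provider keys that have credentials in pi's auth.json.

    ASSUMPTION: auth.json is a JSON object keyed by provider name.
    """
    auth_path = Path(agent_dir).expanduser() / "auth.json"
    if not auth_path.exists():
        return set()
    with auth_path.open() as f:
        return set(json.load(f))
```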
### What the proxy adds
Proxy-specific settings are configured via environment variables or the /proxy config panel (when installed as a pi package):
| Setting | Env variable | Default | Description |
|---|---|---|---|
| Bind address | PI_PROXY_HOST | 127.0.0.1 | Network interface (127.0.0.1 = local only, 0.0.0.0 = all) |
| Port | PI_PROXY_PORT | 4141 | HTTP listen port |
| Auth token | PI_PROXY_AUTH_TOKEN | (disabled) | Bearer token for proxy authentication |
| Remote images | PI_PROXY_REMOTE_IMAGES | false | Allow remote image URL fetching |
| Max body size | PI_PROXY_MAX_BODY_SIZE | 52428800 (50 MB) | Maximum request body size in bytes |
| Upstream timeout | PI_PROXY_UPSTREAM_TIMEOUT_MS | 120000 (120s) | Upstream request timeout in milliseconds |
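Resolving these variables against their documented defaults can be sketched like this (the function and key names are illustrative; the proxy's own parsing may differ):

```python
import os

# Defaults taken from the table above
ENV_DEFAULTS = {
    "PI_PROXY_HOST": "127.0.0.1",
    "PI_PROXY_PORT": "4141",
    "PI_PROXY_AUTH_TOKEN": "",          # empty means auth disabled
    "PI_PROXY_REMOTE_IMAGES": "false",
    "PI_PROXY_MAX_BODY_SIZE": "52428800",       # 50 MB
    "PI_PROXY_UPSTREAM_TIMEOUT_MS": "120000",   # 120 s
}

def load_settings(env=os.environ) -> dict:
    """Resolve proxy settings from the environment with fallbacks."""
    raw = {k: env.get(k, default) for k, default in ENV_DEFAULTS.items()}
    return {
        "host": raw["PI_PROXY_HOST"],
        "port": int(raw["PI_PROXY_PORT"]),
        "auth_token": raw["PI_PROXY_AUTH_TOKEN"] or None,
        "remote_images": raw["PI_PROXY_REMOTE_IMAGES"].lower() == "true",
        "max_body_size": int(raw["PI_PROXY_MAX_BODY_SIZE"]),
        "upstream_timeout_ms": int(raw["PI_PROXY_UPSTREAM_TIMEOUT_MS"]),
    }
```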
Model exposure settings are configured via the JSON config file (~/.pi/agent/proxy-config.json):
| Setting | Default | Description |
|---|---|---|
| publicModelIdMode | collision-prefixed | Public ID format: collision-prefixed, universal, always-prefixed |
| modelExposureMode | all | Which models to expose: all, scoped, custom |
| scopedProviders | [] | Provider keys for scoped mode |
| customModels | [] | Canonical model IDs for custom mode |
| providerPrefixes | {} | Provider key -> custom prefix label overrides |
When used as a pi package, all settings are persisted in ~/.pi/agent/proxy-config.json and applied when the extension spawns the proxy.
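Putting the model-exposure settings together, an illustrative `~/.pi/agent/proxy-config.json` might look like the following. Field names come from the table above; the values are examples only, and the file persisted by the settings panel may carry additional keys.

```json
{
  "publicModelIdMode": "always-prefixed",
  "modelExposureMode": "scoped",
  "scopedProviders": ["anthropic", "openai"],
  "customModels": [],
  "providerPrefixes": { "openai": "oai" }
}
```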
### Discovering available models
List all models the proxy exposes (filtered by the active exposure mode):
```bash
curl http://localhost:4141/v1/models | jq '.data[].id'
```

Model objects follow the standard OpenAI shape (`id`, `object`, `created`, `owned_by`) with no proprietary extensions.
### Per-request API key override
The X-Pi-Upstream-Api-Key header overrides the registry-resolved API key for a single request. This keeps Authorization available for proxy authentication:
```bash
curl http://localhost:4141/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-Pi-Upstream-Api-Key: sk-your-key-here" \
  -d '{"model": "openai/gpt-4o", "messages": [{"role": "user", "content": "Hi"}]}'
```

### Proxy authentication
Set PI_PROXY_AUTH_TOKEN to require a bearer token for all requests:
```bash
PI_PROXY_AUTH_TOKEN=my-secret-token pi-openai-proxy

# Clients must include the token
curl http://localhost:4141/v1/models \
  -H "Authorization: Bearer my-secret-token"
```

## API compatibility
The proxy implements a subset of the OpenAI Chat Completions API. Request and response shapes match the OpenAI specification for supported fields. Unsupported fields are rejected with 422 and an OpenAI-style error body naming the offending parameter.
There is no OpenAPI/Swagger spec for the proxy itself. Use the OpenAI API reference as the primary documentation, noting the supported subset listed in this README.
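For clients without an OpenAI SDK, a stdlib-only sketch of talking to the proxy looks like this. The base URL and model assume the local defaults above; building the request is side-effect free, and only `send()` touches the network.

```python
import json
import urllib.request
from urllib.error import HTTPError

def build_request(model: str, prompt: str,
                  base: str = "http://localhost:4141/v1") -> urllib.request.Request:
    """Build a Chat Completions request against the proxy."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        f"{base}/chat/completions", data=body,
        headers={"Content-Type": "application/json"}, method="POST",
    )

def send(req: urllib.request.Request) -> dict:
    """Send the request; unsupported fields come back as a 422 with an
    OpenAI-style error body naming the offending parameter."""
    try:
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)
    except HTTPError as e:
        return json.load(e)  # the error body, e.g. {"error": {"param": ...}}

req = build_request("openai/gpt-4o", "Hello!")
# send(req) would return the completion once the proxy is running
```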
## Pi Integration
Install as a pi package to get the /proxy command family and --proxy flag:
```bash
pi install npm:@victor-software-house/pi-openai-proxy
```

### Command family
```
/proxy          Open the settings panel
/proxy start    Start the proxy server
/proxy stop     Stop the proxy server (session-managed only)
/proxy status   Show proxy status
/proxy config   Open the settings panel
/proxy show     Summarize current configuration
/proxy path     Show config file location
/proxy reset    Restore default settings
/proxy help     Show usage
```

### Settings panel
/proxy (or /proxy config) opens an interactive settings panel where you can configure the bind address, port, auth token, remote images, body size limit, and upstream timeout. Changes are saved to ~/.pi/agent/proxy-config.json immediately. Restart the proxy to apply changes.
### Standalone (background) mode
For a proxy that outlives pi sessions, run the binary directly:
```bash
# Foreground
pi-openai-proxy

# Background
pi-openai-proxy &

# With custom port
PI_PROXY_PORT=8080 pi-openai-proxy &
```

The extension detects externally running instances and shows their status via `/proxy status` without trying to manage them.
## Architecture
```
┌─────────────────────────────┐      ┌──────────────────────────────────┐
│  HTTP Client                │      │  pi-openai-proxy                 │
│  (curl, Aider, Continue,    │      │                                  │
│   LiteLLM, Open WebUI, etc.)│      │  ┌────────────────────────────┐  │
└─────────────┬───────────────┘      │  │  Hono HTTP Server          │  │
              │                      │  │  ├─ Request parser         │  │
              │  POST /v1/chat/      │  │  ├─ Message converter      │  │
              │    completions       │  │  ├─ Model resolver         │  │
              ├─────────────────────►│  │  ├─ Tool converter         │  │
              │                      │  │  └─ SSE encoder            │  │
              │  GET /v1/models      │  └────────────────────────────┘  │
              ├─────────────────────►│                                  │
              │                      │  ┌────────────────────────────┐  │
              │                      │  │  Pi SDK                    │  │
              │  SSE / JSON          │  │  ├─ ModelRegistry          │  │
              │◄─────────────────────┤  │  ├─ AuthStorage            │  │
              │                      │  │  ├─ streamSimple()         │  │
              │                      │  │  └─ completeSimple()       │  │
                                     │  └────────────────────────────┘  │
                                     └──────────────────────────────────┘
```

### Pi SDK layers used

- `@mariozechner/pi-ai` — `streamSimple()`, `completeSimple()`, `Model`, `Usage`, `AssistantMessageEvent`
- `@mariozechner/pi-coding-agent` — `ModelRegistry`, `AuthStorage`
## Security defaults

- Binds to `127.0.0.1` (localhost only) by default
- Remote image URLs disabled by default
- Request body size limited to 50 MB
- Upstream timeout of 120 seconds
- Secrets are never included in error responses
- Client disconnects abort upstream work immediately
## Dev Workflow

```bash
bun install         # Install dependencies
bun run dev         # Run in development
bun run build       # Build for npm (tsdown)
bun run typecheck   # TypeScript strict check
bun run lint        # Biome + oxlint (strict)
bun test            # Run all tests
```

### Tooling
- Bun — runtime, test runner, package manager
- tsdown — npm build (ESM + .d.ts)
- Biome — format + lint
- oxlint — type-aware lint with strict rules (`.oxlintrc.json`)
- lefthook — pre-commit hooks (format, lint, typecheck), pre-push hooks (test)
- commitlint — conventional commits
- semantic-release — automated versioning and npm publish
- mise — tool version management (node, bun)
## License
MIT
