@relayplane/proxy
Intelligent LLM Proxy — Route smarter, spend less.
RelayPlane is a local proxy that sits between your AI tools (like Claude Code, Cursor, or custom apps) and LLM providers (Anthropic, OpenAI, etc.). It automatically routes your requests to the most cost-effective model for each task, potentially saving 50-80% on API costs.
⚠️ Important: Cost Monitoring Required
RelayPlane routes requests directly to LLM provider APIs using your API keys. This incurs real costs on your account.
Before using RelayPlane:
- Set up billing alerts with your LLM providers (Anthropic, OpenAI, etc.)
- Monitor your usage through your provider's dashboard
- Start with test requests to understand routing behavior
- Review the `/control/stats` endpoint to track request volume (example below)

RelayPlane provides cost optimization, not cost elimination. You are responsible for monitoring your actual API spending.
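For example, you can poll the stats endpoint from the command line (a minimal sketch; the exact response shape may vary by version):

```bash
# Check request volume handled by the proxy (response shape may vary)
curl -s http://127.0.0.1:3001/control/stats
```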
Table of Contents
- What Does This Do?
- Key Features
- Quick Start
- Security
- How It Works
- Configuration
- API Reference
- Troubleshooting
- Getting Help
- License
What Does This Do?
The Problem
You're using Claude Code, Cursor, or another AI tool. Every request goes to expensive models like Claude Opus or GPT-4, even for simple tasks like "what's 2+2?" or "fix this typo."
The Solution
RelayPlane intercepts your API calls and:
- Analyzes the task — Is this a simple question or complex analysis?
- Routes intelligently — Simple tasks → cheap models (Haiku), complex tasks → powerful models (Opus)
- Learns from outcomes — Tracks what works and improves over time
Result: comparable quality at 50-80% lower cost.
Key Features
| Feature | What It Does |
|---------|--------------|
| Cascade Routing | Starts with cheap model, escalates if response seems uncertain |
| Model Suffixes | Add :cost, :fast, or :quality to any model name |
| Provider Cooldowns | Automatically skips failing providers |
| 100% Local | All data stays on your machine — no cloud dependency |
| Hot Reload | Edit config file, changes apply immediately |
Quick Start
Step 1: Install
```bash
npm install -g @relayplane/proxy
```
Step 2: Start the Proxy
```bash
relayplane-proxy --port 3001 --verbose
```
You should see:
```
RelayPlane proxy listening on http://127.0.0.1:3001
Endpoints:
  POST /v1/messages - Native Anthropic API
  POST /v1/chat/completions - OpenAI-compatible API
Models: relayplane:auto, relayplane:cost, relayplane:fast, relayplane:quality
```
Step 3: Point Your Tools at the Proxy
Add this to your shell profile (~/.bashrc, ~/.zshrc, etc.):
```bash
export ANTHROPIC_BASE_URL=http://localhost:3001
```
Then restart your terminal or run `source ~/.bashrc`.
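To confirm the proxy is reachable before pointing tools at it, you can hit the health endpoint:

```bash
# Should return a success response if the proxy is up
curl -s http://localhost:3001/health
```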
Step 4: Use Your Tools Normally
Claude Code, Cursor, or any OpenAI-compatible tool will now route through RelayPlane automatically. You don't need to change anything else.
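You can also call the proxy directly. Here is a sketch of a raw request to the Anthropic-compatible endpoint (assumes the proxy is on its default port and `ANTHROPIC_API_KEY` is exported):

```bash
# Send a request straight to the proxy; RelayPlane picks the concrete model
curl -s http://localhost:3001/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "relayplane:auto",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "What is 2+2?"}]
  }'
```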
Security
⚠️ Important: Read this section before deploying.
Local-Only by Default
RelayPlane binds to 127.0.0.1 (localhost) by default. This means:
- ✅ Only programs on your computer can access it
- ✅ It's not exposed to your network or the internet
- ✅ No authentication is required (you're the only user)
What RelayPlane Has Access To
When you route through RelayPlane, it can see:
| Data | Access Level |
|------|--------------|
| Your prompts | Full access (needed for routing decisions) |
| API responses | Full access (needed for format conversion) |
| API keys | Passed through to providers, not stored permanently |
| Usage data | Stored locally in ~/.relayplane/data.db |
What RelayPlane Does NOT Do
- ❌ Send data to any external service (except the LLM providers you configure)
- ❌ Store API keys on disk (they're only in memory during requests)
- ❌ Log prompts or responses to files
- ❌ Phone home or check for updates
If You Expose It to the Network
Don't expose RelayPlane to the public internet or untrusted networks. If you do:
- Control endpoints (`/control/*`) have no authentication
- Anyone who can reach the proxy can enable/disable routing
- Anyone can read your config and usage stats
- Consider adding a reverse proxy with authentication (nginx, Caddy, etc.), or tunnel in over SSH as shown below
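If you need to reach a remote RelayPlane instance, one pattern that avoids exposing the port at all is an SSH tunnel (a sketch; `you@remote-host` is a placeholder for your own machine):

```bash
# Forward local port 3001 to the proxy on the remote machine,
# so it stays bound to 127.0.0.1 on both ends
ssh -N -L 3001:127.0.0.1:3001 you@remote-host
```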
Security Hardening Checklist
| Setting | Recommendation |
|---------|----------------|
| Bind address | Keep as 127.0.0.1 (default) |
| API keys | Use environment variables, not config files |
| Config file | Stored at ~/.relayplane/config.json with 600 permissions |
| Data file | Stored at ~/.relayplane/data.db — delete to reset |
| Request size | Limited to 10MB to prevent abuse |
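To check the config file permissions noted above (on a Unix-like system):

```bash
# Config should be readable/writable by you only (-rw-------)
ls -l ~/.relayplane/config.json
# Tighten it if needed
chmod 600 ~/.relayplane/config.json
```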
How It Works
The Request Flow
```
Your Tool (Claude Code, Cursor, etc.)
        │
        ▼
RelayPlane Proxy (localhost:3001)
        │
        ├─── 1. Extract task type from prompt
        ├─── 2. Check routing rules
        ├─── 3. Select optimal model
        ├─── 4. Forward to provider
        ├─── 5. Convert response format if needed
        └─── 6. Return to your tool
        │
        ▼
LLM Provider (Anthropic, OpenAI, etc.)
```
Routing Modes
| Mode | Behavior | When to Use |
|------|----------|-------------|
| relayplane:auto | Cascade routing — starts cheap, escalates if needed | Default, best for most cases |
| relayplane:cost | Always use cheapest model | When cost is critical |
| relayplane:fast | Use fastest model/provider | Real-time applications |
| relayplane:quality | Always use best model | When quality is critical |
Model Suffixes
Append a suffix to any model name to override routing for that request:
```yaml
# Force cheap routing for this request
model: "claude-3-5-sonnet:cost"

# Force quality routing
model: "gpt-4o:quality"

# Force fast routing
model: "relayplane:auto:fast"
```
Cascade Routing (Default)
When you use `relayplane:auto`, here's what happens:
- Send to cheap model first (e.g., Claude Haiku)
- Check response for uncertainty — phrases like "I'm not sure", "I don't know"
- If uncertain, escalate to the next model (e.g., Claude Sonnet)
- Return first confident response
This means simple questions get fast, cheap answers, while complex questions automatically get smarter models.
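You can watch this happen by pulling the routing field out of the `_relayplane` metadata (documented under Response Metadata below) with `jq`:

```bash
# Prints the model the request actually landed on,
# e.g. "anthropic/claude-3-haiku-20240307"
curl -s http://localhost:3001/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{"model": "relayplane:auto", "max_tokens": 128,
       "messages": [{"role": "user", "content": "What is 2+2?"}]}' \
  | jq '._relayplane.routedTo'
```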
Provider Cooldowns
If a provider fails repeatedly (3 times in 60 seconds by default), RelayPlane temporarily stops sending requests to it. This prevents cascading failures and wasted API calls.
After the cooldown period (120 seconds by default), RelayPlane tries the provider again.
Configuration
Config File Location
```
~/.relayplane/config.json
```
This file is created automatically on first run. Changes are applied immediately (hot reload).
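Because of hot reload, you can toggle settings without restarting. For example, flipping `enabled` off with `jq` (a sketch; requires `jq` installed):

```bash
# Switch to pure passthrough mode; the proxy picks up the change immediately
CONFIG=~/.relayplane/config.json
jq '.enabled = false' "$CONFIG" > "$CONFIG.tmp" && mv "$CONFIG.tmp" "$CONFIG"
```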
Full Config Example
```json
{
"enabled": true,
"modelOverrides": {
"claude-opus-4-5": "claude-sonnet-4"
},
"routing": {
"mode": "cascade",
"cascade": {
"enabled": true,
"models": [
"claude-3-haiku-20240307",
"claude-3-5-sonnet-20241022",
"claude-3-opus-20240229"
],
"escalateOn": "uncertainty",
"maxEscalations": 1
},
"complexity": {
"enabled": true,
"simple": "claude-3-haiku-20240307",
"moderate": "claude-3-5-sonnet-20241022",
"complex": "claude-3-opus-20240229"
}
},
"reliability": {
"cooldowns": {
"enabled": true,
"allowedFails": 3,
"windowSeconds": 60,
"cooldownSeconds": 120
}
}
}
```
Config Options Explained
enabled (boolean)
- `true` — Normal routing behavior
- `false` — Pure passthrough mode (no routing, no telemetry)
modelOverrides (object)
Automatically rewrite model names. Useful for forcing cheaper models:
```json
{
"modelOverrides": {
"claude-opus-4-5": "claude-sonnet-4",
"gpt-4o": "gpt-4o-mini"
}
}
```
routing.mode ("cascade" | "standard")
"cascade"— Start cheap, escalate on uncertainty (recommended)"standard"— Use complexity classifier only
routing.cascade.models (array)
Order matters! First model is tried first (should be cheapest), last model is the fallback (should be best).
routing.cascade.escalateOn ("uncertainty" | "refusal" | "error")
"uncertainty"— Escalate when model says "I'm not sure", etc."refusal"— Escalate when model refuses ("I can't help with that")"error"— Escalate on API errors only
reliability.cooldowns
| Setting | Default | Description |
|---------|---------|-------------|
| enabled | true | Enable provider cooldowns |
| allowedFails | 3 | Failures before cooldown |
| windowSeconds | 60 | Time window for counting failures |
| cooldownSeconds | 120 | How long to cool down |
Environment Variables
| Variable | Description | Required? |
|----------|-------------|-----------|
| ANTHROPIC_API_KEY | Your Anthropic API key | Only if not using Claude Code auth |
| OPENAI_API_KEY | Your OpenAI API key | Only for GPT models |
| GEMINI_API_KEY | Your Google API key | Only for Gemini models |
| XAI_API_KEY | Your xAI API key | Only for Grok models |
| RELAYPLANE_CONFIG_PATH | Custom config location | No (default: ~/.relayplane/config.json) |
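Typically you export keys in the same shell profile as the base URL, for example (placeholders, not real keys):

```bash
# Only the keys for providers you actually route to are needed
export ANTHROPIC_API_KEY=sk-ant-...
export OPENAI_API_KEY=sk-...
```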
API Reference
Endpoints
| Endpoint | Method | Description |
|----------|--------|-------------|
| /v1/messages | POST | Anthropic native format |
| /v1/chat/completions | POST | OpenAI format |
| /v1/messages/count_tokens | POST | Count tokens |
| /v1/models | GET | List available models |
| /control/status | GET | Current config and status |
| /control/enable | POST | Enable routing |
| /control/disable | POST | Disable routing |
| /control/config | POST | Update config |
| /health | GET | Health check |
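For example, checking status and temporarily disabling routing from the command line:

```bash
# Inspect current config and routing status
curl -s http://127.0.0.1:3001/control/status
# Put the proxy into passthrough mode
curl -s -X POST http://127.0.0.1:3001/control/disable
```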
Request Headers
| Header | Description |
|--------|-------------|
| Authorization | Bearer token (passed through to Anthropic) |
| x-api-key | API key (alternative to Authorization) |
| X-RelayPlane-Bypass: true | Skip routing for this request |
| X-RelayPlane-Model: <model> | Force specific model for this request |
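For instance, to skip routing for a single request via the bypass header (a sketch; assumes `ANTHROPIC_API_KEY` is exported):

```bash
# X-RelayPlane-Bypass forwards the request to the named model untouched
curl -s http://localhost:3001/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "X-RelayPlane-Bypass: true" \
  -H "content-type: application/json" \
  -d '{"model": "claude-3-5-sonnet-20241022", "max_tokens": 128,
       "messages": [{"role": "user", "content": "Hello"}]}'
```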
Response Metadata
Successful responses include `_relayplane` metadata:
```json
{
"id": "msg_xxx",
"choices": [...],
"_relayplane": {
"runId": "abc123",
"routedTo": "anthropic/claude-3-haiku-20240307",
"taskType": "question_answering",
"confidence": 0.42,
"durationMs": 653,
"mode": "auto"
}
}
```
Troubleshooting
"Connection refused" errors
Problem: Your tool can't connect to the proxy.
Solutions:
- Make sure the proxy is running: `relayplane-proxy --port 3001`
- Check you're using the right port
- Verify `ANTHROPIC_BASE_URL` is set correctly (see the check below)
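A quick way to check both, as referenced above:

```bash
# Is anything listening on the port?
curl -s http://localhost:3001/health || echo "proxy not reachable"
# Is the env var pointing at it?
echo "$ANTHROPIC_BASE_URL"
```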
"Missing authentication" errors
Problem: RelayPlane doesn't have API credentials.
Solutions:
- For Claude Code: Make sure you're logged in (`claude login`)
- For API keys: Set the `ANTHROPIC_API_KEY` environment variable
- Check the proxy logs for more details
Requests are slow
Problem: Responses take longer than expected.
Causes:
- Cascade routing is escalating (check logs for "Escalating from...")
- Provider cooldowns (check logs for "cooled down")
- Network latency to providers
Solution: Use the `:cost` suffix to skip the cascade, or set `routing.mode: "standard"`.
Config changes aren't applying
Problem: You edited the config but nothing changed.
Solutions:
- Check for JSON syntax errors in your config file (a quick check is shown below)
- Look at proxy logs — it should say "Reloaded config from..."
- Make sure you're editing the right file (`~/.relayplane/config.json`)
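To validate the JSON, as mentioned above (either tool works):

```bash
# Both exit non-zero and report the error location on invalid JSON
jq . ~/.relayplane/config.json > /dev/null
python3 -m json.tool ~/.relayplane/config.json > /dev/null
```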
"Unknown model" errors
Problem: RelayPlane doesn't recognize a model name.
Solutions:
- Check spelling — model names are exact (e.g., `claude-3-5-sonnet-20241022`); see the query below
- Use `relayplane:auto` instead of specific models
- Check that the provider API key is set for that model's provider
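To see which model names the proxy accepts, query the models endpoint from the API reference above:

```bash
# Should list the models the proxy knows about
curl -s http://localhost:3001/v1/models
```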
Getting Help
- GitHub Issues: github.com/RelayPlane/proxy/issues
- Documentation: relayplane.com/docs
License
MIT
