ultra-lean-mcp-proxy
v0.3.2
Ultra Lean MCP Proxy - lightweight optimization proxy for MCP (Node.js)
ultra-lean-mcp-proxy
Transparent MCP stdio proxy that reduces token and byte overhead on tools/list and tools/call paths using LAP (Lean Agent Protocol) compression.
One-Line Install
Python (pip)
pip install ultra-lean-mcp-proxy
ultra-lean-mcp-proxy install
Node.js (npx - zero Python dependency)
npx ultra-lean-mcp-proxy install
Both commands auto-discover local MCP client configs (Claude Desktop, Cursor, Windsurf, Claude Code), wrap stdio and URL (http/sse) entries by default, and back up originals.
To uninstall:
ultra-lean-mcp-proxy uninstall
To check current status:
ultra-lean-mcp-proxy status
Add Servers That Get Wrapped
ultra-lean-mcp-proxy install wraps local stdio servers (command + args) and local URL-based transports (http / sse) by default.
Use --no-wrap-url if you only want stdio wrapping.
For Claude Code, add servers in stdio form and use --scope user so they are written to ~/.claude.json (auto-detected):
# Wrappable (stdio)
claude mcp add --scope user filesystem -- npx -y @modelcontextprotocol/server-filesystem /tmp
# Wrappable by default (wrapped via local bridge chain)
claude mcp add --scope user --transport http linear https://mcp.linear.app/mcp
Then run:
ultra-lean-mcp-proxy status
ultra-lean-mcp-proxy install
Note: claude mcp add --scope project ... writes to .mcp.json in the current project. This file is not globally auto-discovered by install yet.
Note: URL wrapping applies to local config files (for example ~/.claude.json, ~/.cursor/mcp.json).
For cloud-managed Claude connectors, use npm CLI wrap-cloud to mirror and wrap them locally:
npx ultra-lean-mcp-proxy wrap-cloud
Features
- Transparent Proxying: Wrap any MCP stdio server without code changes
- Massive Token Savings: 51-83% token reduction across real MCP servers
- Performance Boost: Up to 87% faster response times in benchmarks (48.4% overall; results vary by server)
- Zero Client Changes: Compatible with existing MCP clients
- Tools Hash Sync: Efficient tool list caching with conditional requests
- Delta Responses: Send only changes between responses
- Lazy Loading: On-demand tool discovery for large tool sets
- Result Compression: Compress tool call results using LAP format
Performance Benchmarks
Benchmark figures below are for the Python runtime with the full v2 optimization pipeline enabled. The npm package in Phase C1 currently provides definition compression only.
Real-world benchmark across 5 production MCP servers (147 measured turns):
| Metric | Direct | With Proxy | Savings |
|--------|--------|------------|---------|
| Total Tokens | 82,631 | 23,826 | 71.2% |
| Response Time | 1,047ms | 540ms | 48.4% |
Per-Server Results
| Server | Token Savings | Time Savings | Tools |
|--------|---------------|--------------|-------|
| filesystem | 72.4% | 87.3% | list_directory, search_files |
| memory | 82.7% | 31.8% | read_graph, search_nodes |
| everything | 65.2% | 22.1% | get-resource-links, research |
| sequential-thinking | 61.5% | 3.8% | sequentialthinking |
| puppeteer | 51.2% | -9.7% | puppeteer_navigate, evaluate |
Note: Puppeteer showed time overhead due to heavy I/O operations, but still achieved 51% token savings.
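The savings columns are plain relative reductions. As a quick check of the headline numbers, the sketch below recomputes them from the Direct and With Proxy values in the summary table; `savings_percent` is a hypothetical helper written for this illustration, not part of the package.

```python
def savings_percent(direct: float, proxied: float) -> float:
    """Relative reduction of `proxied` versus `direct`, in percent.

    A negative result means the proxied run was larger or slower.
    """
    if direct <= 0:
        raise ValueError("direct must be positive")
    return (direct - proxied) / direct * 100.0


# Values copied from the summary table above.
token_savings = savings_percent(82_631, 23_826)
time_savings = savings_percent(1_047, 540)

print(f"tokens: {token_savings:.1f}%  time: {time_savings:.1f}%")
# prints "tokens: 71.2%  time: 48.4%"
```

With this definition, the -9.7% time figure for puppeteer means the proxied run took longer than the direct run.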
Installation
Basic Installation
pip install ultra-lean-mcp-proxy
With Proxy Support (Recommended)
pip install 'ultra-lean-mcp-proxy[proxy]'
Development Installation
pip install 'ultra-lean-mcp-proxy[dev]'
Quick Start
Wrap Any MCP Server
# Wrap the filesystem server
ultra-lean-mcp-proxy proxy -- npx -y @modelcontextprotocol/server-filesystem /tmp
# Wrap a Python MCP server
ultra-lean-mcp-proxy proxy -- python -m my_mcp_server
# Wrap with runtime stats
ultra-lean-mcp-proxy proxy --stats -- npx -y @modelcontextprotocol/server-memory
# Enable verbose logging
ultra-lean-mcp-proxy proxy -v -- npx -y @modelcontextprotocol/server-everything
Claude Desktop Integration
Update your claude_desktop_config.json:
{
  "mcpServers": {
    "filesystem-optimized": {
      "command": "ultra-lean-mcp-proxy",
      "args": [
        "proxy",
        "--stats",
        "--",
        "npx",
        "-y",
        "@modelcontextprotocol/server-filesystem",
        "/Users/yourname/Documents"
      ]
    }
  }
}
Now when Claude uses the filesystem server, all communication is automatically optimized.
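The config above follows a simple pattern: the original command and its arguments move behind a `--` separator, and the proxy becomes the new command. The sketch below shows that transformation for one entry. It illustrates the pattern only and makes no claim about how the `install` command is implemented; `wrap_stdio_entry` is a hypothetical helper name.

```python
import json

PROXY_COMMAND = "ultra-lean-mcp-proxy"


def wrap_stdio_entry(entry: dict, stats: bool = False) -> dict:
    """Return a copy of a stdio server entry routed through the proxy."""
    proxy_args = ["proxy"]
    if stats:
        proxy_args.append("--stats")
    # Everything after "--" is the original upstream command line.
    proxy_args += ["--", entry["command"], *entry.get("args", [])]
    wrapped = dict(entry)  # other keys (for example "env") are kept as-is
    wrapped["command"] = PROXY_COMMAND
    wrapped["args"] = proxy_args
    return wrapped


# An unwrapped filesystem entry, as it might appear before editing.
original = {
    "command": "npx",
    "args": ["-y", "@modelcontextprotocol/server-filesystem", "/Users/yourname/Documents"],
}
wrapped = wrap_stdio_entry(original, stats=True)
print(json.dumps({"mcpServers": {"filesystem-optimized": wrapped}}, indent=2))
```

The printed JSON matches the hand-written config shown above, and the original entry is left unmodified so it can be restored later.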
Configuration
Command-Line Flags
# All optimization vectors are ON by default.
# Use --disable-* flags to opt out.
ultra-lean-mcp-proxy proxy \
--disable-lazy-loading \
-- <upstream-command>
# Fine-tune optimization parameters
ultra-lean-mcp-proxy proxy \
--result-compression-mode aggressive \
--lazy-mode search_only \
--cache-ttl 3600 \
--delta-min-savings 0.15 \
-- <upstream-command>
# Dump effective configuration
ultra-lean-mcp-proxy proxy --dump-effective-config -- <upstream-command>
Configuration File
Create ultra-lean-mcp-proxy.config.json or .yaml:
{
  "result_compression_enabled": true,
  "result_compression_mode": "aggressive",
  "delta_responses_enabled": true,
  "lazy_loading_enabled": true,
  "lazy_mode": "search_only",
  "tools_hash_sync_enabled": true,
  "caching_enabled": true,
  "cache_ttl_seconds": 3600
}
Load with:
ultra-lean-mcp-proxy proxy --config ultra-lean-mcp-proxy.config.json -- <upstream-command>
Environment Variables
Prefix any config option with ULTRA_LEAN_MCP_PROXY_:
export ULTRA_LEAN_MCP_PROXY_RESULT_COMPRESSION_ENABLED=true
export ULTRA_LEAN_MCP_PROXY_CACHE_TTL_SECONDS=3600
ultra-lean-mcp-proxy proxy -- <upstream-command>
Optimization Features
1. Tool Definition Compression
Compresses tools/list responses using LAP format:
Before (JSON Schema):
{
  "name": "search_files",
  "description": "Search for files matching a pattern",
  "inputSchema": {
    "type": "object",
    "properties": {
      "pattern": {"type": "string", "description": "Glob pattern"},
      "max_results": {"type": "number", "default": 100}
    },
    "required": ["pattern"]
  }
}
After (LAP):
@tool search_files
@desc Search for files matching a pattern
@in pattern:string Glob pattern
@opt max_results:number=100
2. Tools Hash Sync
Efficient caching using conditional requests:
- Client: "Give me tools if hash != abc123"
- Server (unchanged): 304 Not Modified
- Server (changed): 200 OK with new tools
Hit ratio in benchmarks: 84.1% (37 hits, 7 misses)
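The exchange above can be sketched as a hash comparison. The hashing scheme (SHA-256 over canonical JSON), the response shape, and the names `tools_hash` and `conditional_tools_list` are all assumptions made for illustration; the proxy's actual wire format is not described here.

```python
import hashlib
import json


def tools_hash(tools: list) -> str:
    """Hash a tool list in a key-order-independent way."""
    canonical = json.dumps(tools, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def conditional_tools_list(tools: list, client_hash=None) -> dict:
    """Return the full list only when the client's hash is stale."""
    current = tools_hash(tools)
    if client_hash == current:
        return {"status": "not_modified", "hash": current}  # the "304" case
    return {"status": "ok", "hash": current, "tools": tools}  # the "200" case


tools = [{"name": "search_files", "description": "Search for files matching a pattern"}]
first = conditional_tools_list(tools)                             # no hash yet
second = conditional_tools_list(tools, client_hash=first["hash"])  # unchanged
# Hypothetical second tool, added to force a hash change.
tools.append({"name": "list_directory", "description": "List a directory"})
third = conditional_tools_list(tools, client_hash=first["hash"])   # changed

print(first["status"], second["status"], third["status"])
# prints "ok not_modified ok"

hits, misses = 37, 7  # figures from the benchmark line above
print(f"hit ratio: {hits / (hits + misses):.1%}")
# prints "hit ratio: 84.1%"
```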
3. Delta Responses
Send only changes between tool calls:
First call:
{"status": "running", "progress": 0, "message": "Starting..."}
Second call (delta):
{"progress": 50, "message": "Processing..."}
4. Lazy Loading
Load tools on-demand instead of all at once:
- Off: All tools sent upfront
- Minimal: Send 5 most-used tools initially
- Search Only: Only send search/discovery tools, load others when called
Best for servers with 20+ tools.
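The three modes can be pictured as filters over the upstream tool list. The sketch below is an assumption-heavy illustration: only `search_only` appears as a flag value in this README, so the strings `off` and `minimal`, the keyword heuristic for spotting discovery tools, and the `initial_tools` helper are all hypothetical.

```python
# Assumption: "discovery" tools are recognised by these substrings in the name.
DISCOVERY_KEYWORDS = ("search", "list", "find")


def initial_tools(tools, mode, usage_counts=None, minimal_count=5):
    """Return the tool names advertised up front for a given lazy mode."""
    names = [tool["name"] for tool in tools]
    if mode == "off":
        return names
    if mode == "minimal":
        counts = usage_counts or {}
        # sorted() is stable, so tools with equal counts keep their order.
        ranked = sorted(names, key=lambda n: counts.get(n, 0), reverse=True)
        return ranked[:minimal_count]
    if mode == "search_only":
        return [n for n in names if any(k in n for k in DISCOVERY_KEYWORDS)]
    raise ValueError(f"unknown lazy mode: {mode!r}")


# Illustrative tool names and call counts, not benchmark data.
tools = [{"name": n} for n in (
    "read_file", "write_file", "search_files", "list_directory",
    "move_file", "get_file_info", "create_directory",
)]
usage = {"read_file": 40, "search_files": 25, "list_directory": 12,
         "write_file": 9, "get_file_info": 3, "move_file": 1}

print(initial_tools(tools, "minimal", usage))
# prints ['read_file', 'search_files', 'list_directory', 'write_file', 'get_file_info']
print(initial_tools(tools, "search_only"))
# prints ['search_files', 'list_directory']
```

With only seven tools the saving is small, which is consistent with the advice to reserve lazy loading for larger tool sets.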
5. Result Compression
Compress tool call results:
- Balanced: Compress descriptions, preserve structure
- Aggressive: Maximum compression, lean LAP format
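To make the LAP notation used in this section more concrete, the sketch below converts a JSON Schema tool definition into the four directives seen in the Tool Definition Compression example (`@tool`, `@desc`, `@in`, `@opt`). It mirrors that one example only; the real LAP grammar may differ, and `to_lap` plus the `any` fallback type are assumptions.

```python
import json


def to_lap(tool: dict) -> str:
    """Render a tool definition in a LAP-like line format."""
    lines = [f"@tool {tool['name']}"]
    if tool.get("description"):
        lines.append(f"@desc {tool['description']}")
    schema = tool.get("inputSchema", {})
    required = set(schema.get("required", []))
    for name, spec in schema.get("properties", {}).items():
        marker = "@in" if name in required else "@opt"
        field = f"{marker} {name}:{spec.get('type', 'any')}"
        if "default" in spec:
            # Assumption: defaults are written as JSON literals.
            field += f"={json.dumps(spec['default'])}"
        if spec.get("description"):
            field += f" {spec['description']}"
        lines.append(field)
    return "\n".join(lines)


tool = {
    "name": "search_files",
    "description": "Search for files matching a pattern",
    "inputSchema": {
        "type": "object",
        "properties": {
            "pattern": {"type": "string", "description": "Glob pattern"},
            "max_results": {"type": "number", "default": 100},
        },
        "required": ["pattern"],
    },
}

lap = to_lap(tool)
print(lap)
print(f"{len(json.dumps(tool))} JSON chars vs {len(lap)} LAP chars")
```

Character counts are only a rough proxy for tokens, but they show where the reduction comes from: the schema scaffolding (`type`, `properties`, `required`) collapses into one line per parameter.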
CLI Reference
Install / Uninstall
# Install: wrap all MCP servers with proxy
ultra-lean-mcp-proxy install [--dry-run] [--client NAME] [--skip NAME] [--offline] [--no-wrap-url] [--no-cloud] [--suffix NAME] [-v]
# `--skip` matches MCP server names inside config files
# Uninstall: restore original configs
ultra-lean-mcp-proxy uninstall [--dry-run] [--client NAME] [--runtime pip|npm] [--all] [-v]
# Check status
ultra-lean-mcp-proxy status
# Mirror cloud-scoped Claude URL connectors into local wrapped entries (npm CLI)
npx ultra-lean-mcp-proxy wrap-cloud [--dry-run] [--runtime npm|pip] [--suffix -ulmp] [-v]
Watch Mode (Auto-Update)
# Watch config files, auto-wrap new servers
ultra-lean-mcp-proxy watch
# Watch but keep URL/SSE/HTTP entries unwrapped
ultra-lean-mcp-proxy watch --no-wrap-url
# Run as background daemon
ultra-lean-mcp-proxy watch --daemon
# Stop daemon
ultra-lean-mcp-proxy watch --stop
# Set cloud connector discovery interval (default: 60s)
ultra-lean-mcp-proxy watch --cloud-interval 30
# Customize suffix for cloud-mirrored entries
ultra-lean-mcp-proxy watch --suffix -proxy
Watch mode auto-discovers cloud-scoped Claude MCP connectors when the claude CLI is available on PATH, polling every --cloud-interval seconds.
Proxy (Direct Usage)
ultra-lean-mcp-proxy proxy [--enable-<feature>|--disable-<feature>] [--cache-ttl SEC] [--lazy-mode MODE] -- <upstream-command> [args...]
For troubleshooting, you can enable per-server RPC tracing:
ultra-lean-mcp-proxy proxy --trace-rpc -- <upstream-command>
Architecture
┌──────────┐           ┌────────────────────┐           ┌──────────┐
│          │   stdio   │  ultra-lean-mcp    │   stdio   │ Upstream │
│  Client  │◄─────────►│       proxy        │◄─────────►│   MCP    │
│ (Claude) │           │                    │           │  Server  │
│          │    LAP    │  ┌──────────────┐  │   JSON    │          │
└──────────┘           │  │ Compression  │  │           └──────────┘
                       │  │ Delta Engine │  │
                       │  │ Cache Layer  │  │
                       │  │ Lazy Loader  │  │
                       │  └──────────────┘  │
                       └────────────────────┘
The proxy:
- Sits between client and server as a transparent stdio relay
- Intercepts tools/list and tools/call JSON-RPC messages
- Compresses outgoing responses using LAP format
- Decompresses incoming requests back to JSON Schema
- Maintains delta state, cache, and tool registry
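The delta state in the last bullet can be pictured as a remembered copy of the previous result, against which the next result is diffed. The sketch below assumes flat JSON objects and reuses the two payloads from the Delta Responses section. `make_delta` and `apply_delta` are hypothetical names rather than the proxy's API, and removed keys are deliberately not handled.

```python
def make_delta(previous: dict, current: dict) -> dict:
    """Keep only keys that are new or whose value changed."""
    return {
        key: value
        for key, value in current.items()
        if key not in previous or previous[key] != value
    }


def apply_delta(previous: dict, delta: dict) -> dict:
    """Rebuild the full result on the receiving side."""
    merged = dict(previous)
    merged.update(delta)
    return merged


first = {"status": "running", "progress": 0, "message": "Starting..."}
second = {"status": "running", "progress": 50, "message": "Processing..."}

delta = make_delta(first, second)
print(delta)
# prints {'progress': 50, 'message': 'Processing...'}

# The receiver can reconstruct the full second result from first + delta.
print(apply_delta(first, delta) == second)
# prints True
```

The unchanged `status` field is what the delta omits, which is why the second payload in the Delta Responses example is shorter than the first.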
Use Cases
Production MCP Servers
Wrap existing MCP servers to reduce LLM token costs and improve response times.
High-Volume Tool Servers
Servers with 50+ tools benefit from lazy loading and tools hash sync.
Low-Bandwidth Environments
Reduce network payload sizes by 50-70% with compression.
Development & Testing
Run with --stats to understand token usage patterns and optimization effectiveness.
Monitoring & Stats
Enable stats logging:
ultra-lean-mcp-proxy proxy --stats -- <upstream-command>
Output to stderr:
[2026-02-15 10:28:55] Token savings: 71.2% (82631 → 23826)
[2026-02-15 10:28:55] Time savings: 48.4% (1047ms → 540ms)
[2026-02-15 10:28:55] tools_hash hit ratio: 37:7 (84.1% hits)
[2026-02-15 10:28:55] Upstream traffic: 2858 req tokens, 22528 resp tokens
Related Projects
- ultra-lean-mcp-core - Zero-dependency core library for LAP compilation/decompilation
- ultra-lean-mcp - MCP server + CLI for LAP workflows
Contributing
See CONTRIBUTING.md for development setup and guidelines.
License
MIT License - see LICENSE for details.
Part of the Lean Agent Protocol ecosystem.
