# mcp-rate-limiter

v1.0.0
Rate limiting proxy for MCP (Model Context Protocol) servers. Sits transparently between your MCP client and server, applying configurable token bucket rate limits to tool calls while passing all other messages through untouched.
## Features
- Token bucket algorithm — smooth rate limiting with burst allowance
- Per-tool limits — different limits for different tools
- Global limits — overall cap across all tool calls
- Request queuing — requests wait instead of being rejected (configurable queue size)
- Transparent proxy — non-tool-call messages pass through untouched
- Retry-after hints — rate limit errors include retry timing
- Zero runtime dependencies — a pure Node.js stdio proxy
## Installation

```sh
npm install -g mcp-rate-limiter
# or use without installing:
npx mcp-rate-limiter --help
```

## Usage

### With a YAML config file

```sh
npx mcp-rate-limiter --config rate-limit.yaml
```

### Quick global limit

```sh
npx mcp-rate-limiter --command "node my-mcp-server.js" --global-rps 10
```

### Wrap any MCP server

```sh
npx mcp-rate-limiter \
  --command "npx @modelcontextprotocol/server-memory" \
  --global-rps 5 \
  --burst 10 \
  --verbose
```

## Config Format
```yaml
# rate-limit.yaml
command: "node my-mcp-server.js"
args:
  - "--port"
  - "3000"

global:
  rps: 10        # requests per second
  burst: 20      # bucket capacity (max burst)
  queueSize: 50  # queue up to N requests instead of rejecting

tools:
  expensive_tool:
    rpm: 5       # requests per minute
    burst: 2
  fast_tool:
    rps: 50
    burst: 100
  default:       # fallback for unlisted tools
    rps: 10
    burst: 15
    queueSize: 10
```

## CLI Options
| Flag | Description |
|------|-------------|
| `-c, --config <file>` | Path to YAML config file |
| `--command <cmd>` | MCP server command to wrap |
| `--global-rps <n>` | Global requests per second |
| `--global-rpm <n>` | Global requests per minute |
| `--burst <n>` | Burst allowance (bucket capacity) |
| `--queue-size <n>` | Max queued requests (default: 0 = reject) |
| `-v, --verbose` | Verbose logging to stderr |
## How it Works
```
MCP Client (stdin/stdout)
        │
        ▼
mcp-rate-limiter (this tool)
  - reads JSON-RPC from stdin
  - intercepts tools/call messages
  - checks token bucket rate limits
  - queues or rejects if over limit
  - forwards allowed calls to child process
        │
        ▼
MCP Server (child process)
```

Non-tool-call messages (`initialize`, `ping`, `prompts/list`, `resources/read`, etc.) pass through immediately without any rate limiting.
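The interception step above can be sketched as a small message filter (a minimal illustration, not the package's actual source; `route`, `forward`, and `gate` are hypothetical names for the proxy's pass-through and rate-limit paths):

```javascript
// Minimal sketch: decide which line-delimited JSON-RPC messages are rate limited.
// Only "tools/call" requests are gated; everything else is forwarded untouched.
function isToolCall(msg) {
  return msg.method === "tools/call" && msg.id !== undefined;
}

function route(line, forward, gate) {
  let msg;
  try {
    msg = JSON.parse(line);
  } catch {
    forward(line); // not parseable JSON: pass through untouched
    return;
  }
  if (isToolCall(msg)) {
    gate(msg); // apply token bucket check, then queue, forward, or reject
  } else {
    forward(line); // initialize, ping, prompts/list, resources/read, ...
  }
}
```

The key design point is that the proxy only needs to understand one method, `tools/call`; it never has to model the rest of the protocol.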
Rate limit exceeded response:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "error": {
    "code": -32029,
    "message": "Rate limit exceeded",
    "data": {
      "retryAfter": 2,
      "retryAfterMs": 1800,
      "hint": "Retry after 2s"
    }
  }
}
```

## Token Bucket Algorithm
Each bucket tracks:

- `tokens` — available request slots (refilled over time)
- `capacity` (burst) — maximum tokens; allows short bursts
- `rate` — tokens added per second (`rps`, or `rpm / 60`)

On each tool call:

1. Refill tokens based on elapsed time
2. If a token is available → consume it and allow the call
3. If no token but queue space remains → enqueue the request; resolve it when a token becomes available
4. If no token and the queue is full → reject with a retry-after hint
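The refill-and-consume steps above can be sketched in a few lines (a simplified model, not the package's actual implementation — it omits the queue and takes an explicit clock for clarity):

```javascript
// Minimal token bucket: refill on demand, consume one token per call.
class TokenBucket {
  constructor(rate, capacity, now = Date.now()) {
    this.rate = rate;         // tokens added per second
    this.capacity = capacity; // burst size
    this.tokens = capacity;   // start full
    this.last = now;          // timestamp of last refill (ms)
  }

  tryConsume(now = Date.now()) {
    // 1. Refill based on elapsed time, capped at capacity
    const elapsedSec = (now - this.last) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.rate);
    this.last = now;
    // 2. Consume if a whole token is available
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return { allowed: true };
    }
    // 3. Otherwise report how long until the next token
    const retryAfterMs = Math.ceil(((1 - this.tokens) / this.rate) * 1000);
    return { allowed: false, retryAfterMs };
  }
}
```

For example, with `rps: 10` and `burst: 20`, a client can fire 20 back-to-back calls, after which the bucket settles to one allowed call every 100 ms.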
## Use in Claude Desktop / Cursor / Other MCP Clients

Instead of pointing directly at your server, point at mcp-rate-limiter:

```json
{
  "mcpServers": {
    "my-server": {
      "command": "npx",
      "args": [
        "mcp-rate-limiter",
        "--config", "/path/to/rate-limit.yaml"
      ]
    }
  }
}
```

## License
MIT
