# qw-proxy

v1.0.7

HTTP proxy bridge to the AgentRouter API via the qwen-code SDK.
Access AgentRouter models through a simple REST API with minimal token overhead (~21 input tokens vs. 12k+ by default).
## Installation

```shell
npm install -g qw-proxy
```

## Quick Start

```shell
# Basic usage (12k+ input tokens)
AGENTROUTER_API_KEY=sk-xxx qw-proxy

# Optimized usage (~21 input tokens)
mkdir -p .qwen && echo "You are a helpful assistant." > .qwen/system.md
QWEN_SYSTEM_MD=1 AGENTROUTER_API_KEY=sk-xxx qw-proxy
```

The server starts on http://localhost:3001.
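Once the server is running, requests can be sent from any HTTP client. A minimal Python sketch (standard library only; assumes the default port and the `/chat` endpoint described in this README):

```python
import json
import urllib.request

BASE_URL = "http://localhost:3001"  # default PORT


def build_chat_payload(user_id, message, images=None):
    """Assemble the JSON body expected by POST /chat."""
    payload = {"user_id": user_id, "message": message}
    if images:
        payload["images"] = images  # local image paths, vision models only
    return payload


def chat(user_id, message, images=None):
    """Send one message and return the decoded JSON response."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat",
        data=json.dumps(build_chat_payload(user_id, message, images)).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # The proxy enforces a 30s timeout, so wait slightly longer client-side.
    with urllib.request.urlopen(req, timeout=35) as resp:
        return json.load(resp)
```

Calling `chat("test", "Hello!")` mirrors the curl examples in the API reference.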
## Supported Models
| Model | Status | Notes |
|-------|--------|-------|
| deepseek-v3.2 | ✅ | Default, recommended |
| gpt-5.2 | ✅ | GPT-5.2 via AgentRouter |
| claude-haiku-4-5-20251001 | ✅ | Fast Claude model |
| claude-sonnet-4-5-20250929 | ✅ | Claude Sonnet 4.5 |
| glm-4.5 | ✅ | GLM model |
| glm-4.6 | ✅ | GLM model |
Set the model via the `AGENTROUTER_MODEL` environment variable.
## Environment Variables
| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| AGENTROUTER_API_KEY | Yes | - | Your AgentRouter API key |
| AGENTROUTER_MODEL | No | deepseek-v3.2 | Model to use |
| AGENTROUTER_BASE_URL | No | https://agentrouter.org/v1 | API base URL |
| PORT | No | 3001 | Server port |
| QWEN_SYSTEM_MD | No | - | Set to 1 for minimal tokens |
## API Reference

### POST /chat

Send a message and get an AI response.
Request:

```json
{
  "user_id": "user123",
  "message": "Hello!",
  "images": ["/path/to/image.png"]
}
```

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| user_id | string | Yes | Unique user ID for conversation history |
| message | string | Yes | User message |
| images | string[] | No | Local image paths (vision models) |
Response:

```json
{
  "success": true,
  "response": "Hello! How can I help you?",
  "history_length": 3
}
```

Example:

```shell
curl http://localhost:3001/chat \
  -H "Content-Type: application/json" \
  -d '{"user_id": "test", "message": "Hello!"}'
```

### GET /health

Health check.
```json
{
  "status": "ok",
  "model": "deepseek-v3.2",
  "active_users": 1,
  "active_clients": 1
}
```

### POST /clear

Clear conversation history.
```shell
curl -X POST http://localhost:3001/clear \
  -H "Content-Type: application/json" \
  -d '{"user_id": "user123"}'
```

### POST /cancel/:user_id

Cancel an ongoing generation.

```shell
curl -X POST http://localhost:3001/cancel/user123
```

### GET /api/compression/stats

Get conversation compression statistics.
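The `/clear` and `/cancel` endpoints differ in where they carry the user id (body vs. URL path). A small Python sketch that builds the corresponding requests, assuming the default port (pass the returned request to `urllib.request.urlopen` to send it):

```python
import json
import urllib.request

BASE_URL = "http://localhost:3001"


def _post(path, body=None):
    """Build a POST request for a qw-proxy endpoint."""
    data = json.dumps(body).encode() if body is not None else b""
    return urllib.request.Request(
        f"{BASE_URL}{path}",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def clear_history(user_id):
    # /clear takes the user id in the JSON body
    return _post("/clear", {"user_id": user_id})


def cancel(user_id):
    # /cancel takes the user id in the URL path, with no body
    return _post(f"/cancel/{user_id}")
```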
## Features

- Minimal tokens: ~21 input tokens with `QWEN_SYSTEM_MD=1` (vs. 12k+ by default)
- Per-user history: Isolated conversation contexts
- Auto-compression: LLM-based history compression when the limit is exceeded
- Vision support: Send images with messages
- Request cancellation: Cancel ongoing generations
- 30s timeout: Protection against hanging requests
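To illustrate the per-user history and auto-compression features, here is a simplified sketch of how such a store could work. This is not qw-proxy's actual implementation: the real proxy compresses with an LLM, while `summarize` below is a stand-in, and `MAX_MESSAGES` is a hypothetical limit.

```python
MAX_MESSAGES = 6  # hypothetical limit; the real limit is internal to qw-proxy


def summarize(messages):
    """Stand-in for the LLM summarizer: collapse messages to one digest entry."""
    return {"role": "system", "content": f"[summary of {len(messages)} messages]"}


class HistoryStore:
    def __init__(self):
        self._histories = {}  # user_id -> message list (isolated per user)

    def append(self, user_id, role, content):
        history = self._histories.setdefault(user_id, [])
        history.append({"role": role, "content": content})
        if len(history) > MAX_MESSAGES:
            # Compress everything except the two most recent messages.
            head, tail = history[:-2], history[-2:]
            self._histories[user_id] = [summarize(head)] + tail
        return self._histories[user_id]
```

Each user's context grows independently, and once it crosses the limit the older turns are folded into a single summary entry, which keeps `history_length` bounded.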
## How It Works

```
Your App → qw-proxy (localhost:3001) → qwen-code SDK → AgentRouter API → LLM
```

The qwen-code SDK handles authentication with AgentRouter; this proxy exposes it as a simple REST API.
## Token Optimization

By default, qwen-code includes ~12k tokens of system prompts, tool definitions, and environment context in every request.

To reduce this to ~21 tokens:

- Create `.qwen/system.md` with a minimal prompt
- Set `QWEN_SYSTEM_MD=1`

This package ships a patched qwen-code that disables tool and environment-context injection.
## License

MIT
