# qw-proxy

v1.0.7

HTTP proxy bridge to the AgentRouter API via the qwen-code SDK.
Access AgentRouter models through a simple REST API with minimal token overhead (~21 input tokens vs. 12k+ by default).
## Installation

```shell
npm install -g qw-proxy
```

## Quick Start

```shell
# Basic usage (12k+ input tokens)
AGENTROUTER_API_KEY=sk-xxx qw-proxy

# Optimized usage (~21 input tokens)
mkdir -p .qwen && echo "You are a helpful assistant." > .qwen/system.md
QWEN_SYSTEM_MD=1 AGENTROUTER_API_KEY=sk-xxx qw-proxy
```

The server starts on http://localhost:3001.
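Once the server is running, requests can be sent from any HTTP client. A minimal Python sketch (standard library only; assumes the default port and the `/chat` endpoint described in this README):

```python
import json
import urllib.request

BASE_URL = "http://localhost:3001"  # default PORT


def build_chat_payload(user_id, message, images=None):
    """Assemble the JSON body expected by POST /chat."""
    payload = {"user_id": user_id, "message": message}
    if images:
        payload["images"] = images  # local image paths, vision models only
    return payload


def chat(user_id, message, images=None):
    """Send one message and return the decoded JSON response."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat",
        data=json.dumps(build_chat_payload(user_id, message, images)).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # The proxy enforces a 30s timeout, so wait slightly longer client-side.
    with urllib.request.urlopen(req, timeout=35) as resp:
        return json.load(resp)
```

Calling `chat("test", "Hello!")` mirrors the curl examples in the API reference.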
## Supported Models
| Model | Status | Notes |
|-------|--------|-------|
| deepseek-v3.2 | ✅ | Default, recommended |
| gpt-5.2 | ✅ | GPT-5.2 via AgentRouter |
| claude-haiku-4-5-20251001 | ✅ | Fast Claude model |
| claude-sonnet-4-5-20250929 | ✅ | Claude Sonnet 4.5 |
| glm-4.5 | ✅ | GLM model |
| glm-4.6 | ✅ | GLM model |
Set the model via the `AGENTROUTER_MODEL` environment variable.
## Environment Variables
| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| AGENTROUTER_API_KEY | Yes | - | Your AgentRouter API key |
| AGENTROUTER_MODEL | No | deepseek-v3.2 | Model to use |
| AGENTROUTER_BASE_URL | No | https://agentrouter.org/v1 | API base URL |
| PORT | No | 3001 | Server port |
| QWEN_SYSTEM_MD | No | - | Set to 1 for minimal tokens |
## API Reference

### POST /chat

Send a message and get an AI response.
Request:

```json
{
  "user_id": "user123",
  "message": "Hello!",
  "images": ["/path/to/image.png"]
}
```

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| user_id | string | Yes | Unique user ID for conversation history |
| message | string | Yes | User message |
| images | string[] | No | Local image paths (vision models) |
Response:

```json
{
  "success": true,
  "response": "Hello! How can I help you?",
  "history_length": 3
}
```

Example:

```shell
curl http://localhost:3001/chat \
  -H "Content-Type: application/json" \
  -d '{"user_id": "test", "message": "Hello!"}'
```

### GET /health

Health check.
```json
{
  "status": "ok",
  "model": "deepseek-v3.2",
  "active_users": 1,
  "active_clients": 1
}
```

### POST /clear

Clear conversation history.
```shell
curl -X POST http://localhost:3001/clear \
  -H "Content-Type: application/json" \
  -d '{"user_id": "user123"}'
```

### POST /cancel/:user_id

Cancel an ongoing generation.

```shell
curl -X POST http://localhost:3001/cancel/user123
```

### GET /api/compression/stats

Get conversation compression statistics.
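The `/clear` and `/cancel` endpoints differ in where they carry the user id (body vs. URL path). A small Python sketch that builds the corresponding requests, assuming the default port (pass the returned request to `urllib.request.urlopen` to send it):

```python
import json
import urllib.request

BASE_URL = "http://localhost:3001"


def _post(path, body=None):
    """Build a POST request for a qw-proxy endpoint."""
    data = json.dumps(body).encode() if body is not None else b""
    return urllib.request.Request(
        f"{BASE_URL}{path}",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def clear_history(user_id):
    # /clear takes the user id in the JSON body
    return _post("/clear", {"user_id": user_id})


def cancel(user_id):
    # /cancel takes the user id in the URL path, with no body
    return _post(f"/cancel/{user_id}")
```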
## Features

- Minimal tokens: ~21 input tokens with `QWEN_SYSTEM_MD=1` (vs. 12k+ by default)
- Per-user history: Isolated conversation contexts
- Auto-compression: LLM-based history compression when the limit is exceeded
- Vision support: Send images with messages
- Request cancellation: Cancel ongoing generations
- 30s timeout: Protection against hanging requests
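To illustrate the per-user history and auto-compression features, here is a simplified sketch of how such a store could work. This is not qw-proxy's actual implementation: the real proxy compresses with an LLM, while `summarize` below is a stand-in, and `MAX_MESSAGES` is a hypothetical limit.

```python
MAX_MESSAGES = 6  # hypothetical limit; the real limit is internal to qw-proxy


def summarize(messages):
    """Stand-in for the LLM summarizer: collapse messages to one digest entry."""
    return {"role": "system", "content": f"[summary of {len(messages)} messages]"}


class HistoryStore:
    def __init__(self):
        self._histories = {}  # user_id -> message list (isolated per user)

    def append(self, user_id, role, content):
        history = self._histories.setdefault(user_id, [])
        history.append({"role": role, "content": content})
        if len(history) > MAX_MESSAGES:
            # Compress everything except the two most recent messages.
            head, tail = history[:-2], history[-2:]
            self._histories[user_id] = [summarize(head)] + tail
        return self._histories[user_id]
```

Each user's context grows independently, and once it crosses the limit the older turns are folded into a single summary entry, which keeps `history_length` bounded.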
## How It Works

```
Your App → qw-proxy (localhost:3001) → qwen-code SDK → AgentRouter API → LLM
```

The qwen-code SDK handles authentication with AgentRouter; this proxy exposes it as a simple REST API.
## Token Optimization

By default, qwen-code includes ~12k tokens of system prompts, tool definitions, and environment context in every request.

To reduce this to ~21 tokens:

- Create `.qwen/system.md` with a minimal prompt
- Set `QWEN_SYSTEM_MD=1`

This package ships a patched qwen-code that disables tool and environment-context injection.
## License

MIT
