@elvatis_com/openclaw-gpu-bridge
v0.2.0
Published
OpenClaw plugin - Remote GPU compute via FastAPI microservice (BERTScore, embeddings)
Maintainers
Readme
@elvatis_com/openclaw-gpu-bridge
OpenClaw plugin to offload ML tasks (BERTScore + embeddings) to one or many remote GPU hosts.
v0.2 Highlights
- Multi-GPU host pool (
hosts[]) with:- round-robin or least-busy load balancing
- automatic failover
- periodic host health checks
- Backward compatibility with v0.1 (
serviceUrl/url) - Flexible model selection per request (
model/model_type) - GPU service model caching (on-demand loading)
- Optional transfer visibility via
/statusendpoint + batch progress logs
Tools
gpu_healthgpu_infogpu_status(new in v0.2)gpu_bertscoregpu_embed
OpenClaw Plugin Config
v0.2 (recommended)
{
"plugins": {
"@elvatis_com/openclaw-gpu-bridge": {
"hosts": [
{
"name": "rtx-2080ti",
"url": "http://your-gpu-host:8765",
"apiKey": "gpu-key-1"
},
{
"name": "rtx-3090",
"url": "http://your-second-gpu-host:8765",
"apiKey": "gpu-key-2"
}
],
"loadBalancing": "least-busy",
"healthCheckIntervalSeconds": 30,
"timeout": 45,
"models": {
"embed": "all-MiniLM-L6-v2",
"bertscore": "microsoft/deberta-xlarge-mnli"
}
}
}
}v0.1 compatibility
{
"plugins": {
"@elvatis_com/openclaw-gpu-bridge": {
"serviceUrl": "http://your-gpu-host:8765",
"apiKey": "gpu-key",
"timeout": 45
}
}
}Config reference
hosts: array of GPU hosts (v0.2)serviceUrl/url: legacy single-host configloadBalancing:round-robinorleast-busyhealthCheckIntervalSeconds: host health polling intervaltimeout: request timeout for compute endpointsapiKey: fallback API key for hosts that do not define per-host keymodels.embed,models.bertscore: plugin-side default models
GPU Service (Python) Setup
cd gpu-service
pip install -r requirements.txt
uvicorn gpu_service:app --host 0.0.0.0 --port 8765Default models are warmed on startup:
- Embed:
all-MiniLM-L6-v2 - BERTScore:
microsoft/deberta-xlarge-mnli
Additional models are loaded on-demand and cached in memory.
Environment variables
API_KEY: requireX-API-Keyfor all endpoints except/healthGPU_MAX_CONCURRENT: max parallel jobs (default2)GPU_EMBED_BATCH: embedding chunk size for progress logging (default32)GPU_MAX_BATCH_SIZE: max items per batch (default100)GPU_MAX_TEXT_LENGTH: max character length per text (default10000)MODEL_BERTSCORE: default warm model for BERTScoreMODEL_EMBED: default warm model for embeddingsTORCH_DEVICE: force device (cuda,cpu,cuda:1)
API Endpoints (GPU Service)
GET /healthGET /infoGET /status(queue + active jobs + progress)POST /bertscorePOST /embed
Request-level model override
/bertscore:
{
"candidates": ["a"],
"references": ["b"],
"model_type": "microsoft/deberta-xlarge-mnli"
}/embed:
{
"texts": ["hello world"],
"model": "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"
}Exposing to the Internet
If you expose your GPU service outside LAN, use defense-in-depth:
Pre-shared key auth (required)
- Set
API_KEYon service - Configure same key in plugin host config (
apiKey) - Requests must include
X-API-Key
- Set
TLS/HTTPS (required on public internet)
- Recommended: nginx reverse proxy with Let’s Encrypt certs
- Alternative: run uvicorn with SSL cert/key directly
nginx reverse proxy example
server {
listen 443 ssl http2;
server_name gpu.example.com;
ssl_certificate /etc/letsencrypt/live/gpu.example.com/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/gpu.example.com/privkey.pem;
location / {
proxy_pass http://127.0.0.1:8765;
proxy_set_header Host $host;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
}uvicorn SSL example
uvicorn gpu_service:app --host 0.0.0.0 --port 8765 \
--ssl-keyfile /path/key.pem \
--ssl-certfile /path/cert.pemOptional: WireGuard VPN instead of public exposure
- Keep service private behind VPN
- Prefer private WireGuard IPs in plugin
hosts[].url
Operational hardening
- Firewall allowlist only OpenClaw server IP
- Rate limiting at reverse proxy
- Monitor logs and rotate keys periodically
Development
npm run build
npm testTypeScript runs in strict mode.
License
MIT
