nvidia-glm-proxy

v2.0.0

Published

16 days ago

Proxy for NVIDIA NIM API that fixes GLM-5.1 tool_call streaming bugs (numeric IDs, missing function.name, ID instability)

Downloads

335

diegolopez02081

nvidia nim glm proxy tool-calls streaming openai-compatible mcp

nvidia-glm-proxy

A lightweight reverse proxy for the NVIDIA NIM API that fixes GLM-5.1 tool_call streaming bugs. Zero dependencies.

Problem

NVIDIA's GLM-5.1 model via NIM has several bugs when streaming tool calls:

Numeric id — tool call IDs are returned as numbers instead of strings (e.g. "id": 1 instead of "id": "call_abc123")
Missing id — tool call IDs are sometimes omitted entirely
Missing function.name — the function name inside tool calls is sometimes null or missing
ID instability — the same tool call index gets different IDs across SSE chunks, breaking accumulation

These bugs cause OpenAI-compatible clients (such as opencode and Claude) to crash or misinterpret tool calls.

Solution

nvidia-glm-proxy sits between your client and integrate.api.nvidia.com, patching responses in real time:

| Bug | Fix |
|---|---|
| Numeric id | Converts to call_<number> string format |
| Missing id | Generates a stable call_<uuid> |
| Missing function.name | Infers from tool definitions + user message + tool_choice |
| Unstable IDs across chunks | Stabilizes IDs per index using a chunk map |
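The three ID-related fixes in the table can be sketched roughly as follows. This is an illustration under stated assumptions, not the proxy's actual source: the helper name `makeToolCallIdFixer` and the per-stream `Map` are hypothetical, and only the `call_<number>` and `call_<uuid>` formats come from the table above.

```javascript
const { randomUUID } = require("node:crypto");

// Hypothetical helper: returns a function that patches tool_call deltas
// for ONE streaming response. Create a fresh fixer for each request.
function makeToolCallIdFixer() {
  const idByIndex = new Map(); // tool call index -> the first ID settled on

  return function fixToolCall(toolCall) {
    const index = toolCall.index ?? 0;

    if (!idByIndex.has(index)) {
      let id = toolCall.id;
      if (typeof id === "number") {
        id = `call_${id}`; // numeric id -> string
      } else if (typeof id !== "string" || id === "") {
        id = `call_${randomUUID()}`; // missing id -> generated once
      }
      idByIndex.set(index, id);
    }

    // Every later chunk for this index reuses the same ID.
    return { ...toolCall, id: idByIndex.get(index) };
  };
}

const fix = makeToolCallIdFixer();
console.log(fix({ index: 0, id: 1, function: { name: "get_weather" } }).id); // "call_1"
console.log(fix({ index: 0, id: 7, function: { arguments: "{}" } }).id); // still "call_1"
```

Keeping the map per request rather than global avoids leaking IDs between unrelated streams; whether the real proxy scopes its chunk map this way is an assumption.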

Install

npm (global)

npm install -g nvidia-glm-proxy

From source

git clone https://github.com/DiegoLopez0208/nvidia-glm-proxy.git
cd nvidia-glm-proxy

Configuration

Copy .env.example to .env and fill in your values:

cp .env.example .env

| Variable | Default | Description |
|---|---|---|
| NVIDIA_API_KEY | (empty) | Your NVIDIA NIM API key. If set, the proxy injects it as Authorization: Bearer when the client doesn't provide one |
| NVIDIA_NIM_HOST | integrate.api.nvidia.com | Upstream NVIDIA NIM host |
| NVIDIA_NIM_PORT | 443 | Upstream port |
| PROXY_PORT | 9999 | Local port the proxy listens on |
| UPSTREAM_TIMEOUT | 120000 | Upstream request timeout in ms |
| BIND_ADDRESS | 127.0.0.1 | Address to bind to |
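As a rough illustration of how these variables and defaults fit together, here is a minimal sketch of a config loader. The function name `loadConfig` and the returned property names are hypothetical; only the variable names and default values come from the table above. It does not show how the .env file itself is read.

```javascript
// Hypothetical loader: reads the documented variables from an env-like
// object and falls back to the defaults listed in the table.
function loadConfig(env = process.env) {
  // Parse an integer setting, using the fallback when unset or invalid.
  const toInt = (value, fallback) => {
    const n = Number.parseInt(value ?? "", 10);
    return Number.isNaN(n) ? fallback : n;
  };

  return {
    apiKey: env.NVIDIA_API_KEY || "",
    upstreamHost: env.NVIDIA_NIM_HOST || "integrate.api.nvidia.com",
    upstreamPort: toInt(env.NVIDIA_NIM_PORT, 443),
    proxyPort: toInt(env.PROXY_PORT, 9999),
    upstreamTimeoutMs: toInt(env.UPSTREAM_TIMEOUT, 120000),
    bindAddress: env.BIND_ADDRESS || "127.0.0.1",
  };
}

console.log(loadConfig({ PROXY_PORT: "8080" }).proxyPort); // 8080
```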

Usage

Start the proxy

node proxy.js

Or with the systemd user service:

bash install.sh

Update your client config

Point your OpenAI-compatible client at the proxy instead of NVIDIA directly:

{
  "provider": {
    "nvidia-proxy": {
      "npm": "@ai-sdk/openai-compatible",
      "options": {
        "baseURL": "http://127.0.0.1:9999/v1",
        "apiKey": "sk-proxy"
      },
      "models": {
        "z-ai/glm-5.1": { "name": "z-ai/glm-5.1" }
      }
    }
  }
}

The apiKey value doesn't matter if NVIDIA_API_KEY is set in .env — the proxy will inject the real key automatically.

Health check

curl http://127.0.0.1:9999/health

How function.name inference works

When GLM-5.1 omits function.name from a tool call, the proxy tries to infer it:

tool_choice — if the request forces a specific function, use that
User message match — if the user message contains a no-param tool name, use that
Keyword scoring — match user message words against tool descriptions and name parts
Single tool fallback — if only one tool is available, use it
Parameter signature matching — match argument keys against tool parameter schemas
Fallback — "unknown" if nothing matches
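The fallback chain above can be sketched as a single function that tries each step in the listed order. This is a simplified illustration, not the proxy's code: the name `inferFunctionName`, its input shape, the word tokenizer, and the one-point-per-matching-word scoring are all assumptions.

```javascript
// Illustrative only: helper names, tokenization, and scoring are assumptions.
const words = (text) =>
  (text || "").toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);

function inferFunctionName({ tools = [], toolChoice, userMessage = "", argKeys = [] }) {
  const fns = tools
    .map((t) => t.function)
    .filter((f) => f && typeof f.name === "string");
  const paramNames = (f) => Object.keys(f.parameters?.properties ?? {});

  // 1. tool_choice forces a specific function.
  if (toolChoice && typeof toolChoice === "object" && toolChoice.function?.name) {
    return toolChoice.function.name;
  }

  // 2. The user message mentions a tool that takes no parameters.
  const msg = userMessage.toLowerCase();
  const noParam = fns.find(
    (f) => paramNames(f).length === 0 && msg.includes(f.name.toLowerCase())
  );
  if (noParam) return noParam.name;

  // 3. Keyword scoring: count message words found in the tool's
  //    description or in the parts of its name (split on non-alphanumerics).
  const msgWords = new Set(words(userMessage));
  let best = null;
  let bestScore = 0;
  for (const f of fns) {
    const candidates = new Set([...words(f.description), ...words(f.name)]);
    let score = 0;
    for (const w of candidates) if (msgWords.has(w)) score += 1;
    if (score > bestScore) {
      best = f;
      bestScore = score;
    }
  }
  if (best) return best.name;

  // 4. Single tool fallback.
  if (fns.length === 1) return fns[0].name;

  // 5. Parameter signature matching: every argument key must be a
  //    property in the tool's parameter schema.
  if (argKeys.length > 0) {
    const match = fns.find((f) => argKeys.every((k) => paramNames(f).includes(k)));
    if (match) return match.name;
  }

  // 6. Nothing matched.
  return "unknown";
}

const exampleTools = [
  { function: { name: "get_weather", description: "Look up forecast for a city",
      parameters: { properties: { city: {} } } } },
  { function: { name: "list_files", description: "Show directory entries",
      parameters: { properties: { path: {} } } } },
];
console.log(inferFunctionName({ tools: exampleTools, userMessage: "What is the weather in Paris?" })); // "get_weather"
```

A real implementation would likely filter stop words and weight name matches differently; this sketch only shows the order in which the steps are tried.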

License

MIT
