# nexos-provider

@crazy-goat/nexos-provider v1.11.0 on npm

Custom AI SDK provider for using nexos.ai models (Gemini, Claude, GPT, Codex, Codestral, Kimi) with opencode.
## What it does

Fixes compatibility issues when using Gemini, Claude, ChatGPT, Codex, and Codestral models through the nexos.ai API in opencode:

- Gemini: appends the missing `data: [DONE]` SSE signal (prevents hanging), inlines `$ref` in tool schemas (rejected by Vertex AI), fixes `finish_reason` for tool calls (`stop` → `tool_calls`)
- Claude: converts thinking params to snake_case (`budgetTokens` → `budget_tokens`), fixes `finish_reason` (`end_turn` → `stop`, preventing an infinite retry loop), strips the `thinking` object when disabled, adds `cache_control` markers for prompt caching, strips `temperature` when thinking is enabled
- ChatGPT/GPT: strips `reasoning_effort: "none"` (unsupported), strips `temperature: false` (invalid value), strips temperature for non-Codex models (nexos.ai chat completions only supports the default temperature; Codex models via the Responses API support custom temperature)
- Codex: transparently redirects requests to `/v1/responses` (Responses API), since Codex models don't support `/v1/chat/completions`; handles streaming, tool calls, reasoning effort, and cache token reporting
- Codestral: sets `strict: false` in tool definitions when `strict` is `null` (the Mistral API rejects `null` for this field)
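For a flavor of what these fixes look like in practice, here is a simplified sketch of two of the request-body transforms. This is illustrative only, not the actual `fix-claude.mjs` / `fix-codestral.mjs` code:

```javascript
// Simplified sketch of two request-body fix-ups; illustrative only,
// not the actual fix-claude.mjs / fix-codestral.mjs code.

// Claude: nexos.ai expects snake_case thinking params, and the
// thinking object must be stripped entirely when disabled.
function fixClaudeBody(body) {
  const out = { ...body };
  if (out.thinking?.budgetTokens !== undefined) {
    const { budgetTokens, ...rest } = out.thinking;
    out.thinking = { ...rest, budget_tokens: budgetTokens };
  }
  if (out.thinking?.type === "disabled") {
    delete out.thinking;
  }
  return out;
}

// Codestral: the Mistral API rejects strict: null in tool definitions.
function fixCodestralBody(body) {
  if (!body.tools) return body;
  return {
    ...body,
    tools: body.tools.map((t) => (t.strict === null ? { ...t, strict: false } : t)),
  };
}
```

On the response side, the modules do the mirror-image work, e.g. rewriting `finish_reason` values before the AI SDK sees them.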
## Setup
1. Set your API key
```sh
export NEXOS_API_KEY="your-nexos-api-key"
```

2. Configure opencode
Add the provider to your ~/.config/opencode/opencode.json:
```json
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "nexos-ai": {
      "npm": "@crazy-goat/nexos-provider",
      "name": "Nexos AI",
      "env": ["NEXOS_API_KEY"],
      "options": {
        "baseURL": "https://api.nexos.ai/v1/",
        "timeout": 300000
      },
      "models": {
        "Gemini 2.5 Pro": {
          "name": "Gemini 2.5 Pro",
          "limit": { "context": 128000, "output": 64000 }
        },
        "Claude Sonnet 4.5": {
          "name": "Claude Sonnet 4.5",
          "limit": { "context": 200000, "output": 16000 },
          "options": {
            "thinking": { "type": "enabled", "budgetTokens": 1024 }
          },
          "variants": {
            "thinking-high": { "thinking": { "type": "enabled", "budgetTokens": 10000 } },
            "no-thinking": { "thinking": { "type": "disabled" } }
          }
        },
        "GPT 5": {
          "name": "GPT 5",
          "limit": { "context": 400000, "output": 128000 },
          "options": { "reasoningEffort": "medium" },
          "variants": {
            "high": { "reasoningEffort": "high" },
            "no-reasoning": { "reasoningEffort": "none" }
          }
        }
      }
    }
  }
}
```

Tip: You can automatically generate the config with all available nexos.ai models using opencode-nexos-models-config.
Warning: Gemini 3 models (Flash Preview, Pro Preview) do not work with tool calling through nexos.ai — see known-bugs/gemini3-tools for details.
3. Use it
Simple prompt:
```sh
opencode run "hello" -m "nexos-ai/Gemini 2.5 Pro"
```

With tool calling:

```sh
opencode run "list files in current directory" -m "nexos-ai/Gemini 2.5 Pro"
```

Claude with thinking:

```sh
opencode run "what is 2+2?" -m "nexos-ai/Claude Sonnet 4.5" --variant thinking-high
```

GPT with reasoning effort:

```sh
opencode run "what is 2+2?" -m "nexos-ai/GPT 5" --variant high
```

Or select the model interactively in opencode with Ctrl+X M.
## Updating

opencode caches the provider in `~/.cache/opencode/`. To force an update to the latest version:

```sh
rm -rf ~/.cache/opencode/node_modules/@crazy-goat
```

The next time you run opencode, it will download the latest version from npm.
## How it works

The provider exports `createNexosAI`, which creates a standard AI SDK provider with a custom fetch wrapper. Per-provider fixes live in separate modules:
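Conceptually, the wrapper inspects the model name in each outgoing request body and applies the matching fix before forwarding the call. A hypothetical, simplified sketch (the fix functions stand in for the real fix-*.mjs modules):

```javascript
// Hypothetical sketch of the per-model dispatch inside the fetch wrapper;
// not the actual createNexosAI implementation.
function createFixingFetch(baseFetch, fixes) {
  return async (url, init = {}) => {
    let body = init.body ? JSON.parse(init.body) : {};
    const model = (body.model ?? "").toLowerCase();
    // Apply every fix whose model-name needle matches the request.
    for (const [needle, fix] of Object.entries(fixes)) {
      if (model.includes(needle)) body = fix(body);
    }
    return baseFetch(url, { ...init, body: JSON.stringify(body) });
  };
}
```

The real wrapper also rewrites responses (e.g. `finish_reason`) and, for Codex, the URL itself; this sketch only covers request bodies.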
```
opencode → createNexosAI → fetch wrapper → nexos.ai API
                                │
                                ├─ fix-gemini.mjs: $ref inlining, finish_reason fix
                                ├─ fix-claude.mjs: thinking params, end_turn → stop
                                ├─ fix-chatgpt.mjs: strips reasoning_effort: "none"
                                ├─ fix-codex.mjs: chat completions → Responses API
                                └─ fix-codestral.mjs: strict: null → false in tools
```

## Testing
Test with a simple prompt:

```sh
opencode run "what is 2+2?" -m "nexos-ai/Gemini 2.5 Pro"
opencode run "what is 2+2?" -m "nexos-ai/Gemini 2.5 Flash"
opencode run "what is 2+2?" -m "nexos-ai/Claude Sonnet 4.5"
opencode run "what is 2+2?" -m "nexos-ai/GPT 5"
```

Test tool calling:

```sh
opencode run "list files in current directory" -m "nexos-ai/Gemini 2.5 Pro"
opencode run "list files in current directory" -m "nexos-ai/Claude Sonnet 4.5"
opencode run "list files in current directory" -m "nexos-ai/GPT 5"
opencode run "list files in current directory" -m "nexos-ai/GPT 5.3 Codex"
```

Test thinking/reasoning variants:

```sh
opencode run "what is 2+2?" -m "nexos-ai/Claude Sonnet 4.5" --variant thinking-high
opencode run "what is 2+2?" -m "nexos-ai/Gemini 2.5 Pro" --variant thinking-high
opencode run "what is 2+2?" -m "nexos-ai/GPT 5" --variant high
opencode run "what is 2+2?" -m "nexos-ai/GPT 5.3 Codex" --variant high
```

### Automated model check

Run check-models/check-all.mjs to test all available models for simple prompts and tool calling:

```sh
node check-models/check-all.mjs
```

Test a single model:

```sh
node check-models/check-all.mjs "GPT 4.1"
```

Results are saved to check-models/checks.md; see the current compatibility status there.
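Per model, the automated check boils down to the same two probes used throughout this README. A hypothetical sketch of how such a probe list might be built (the real check-all.mjs also executes the commands and records the results in checks.md):

```javascript
// Hypothetical sketch; probesFor is not part of the actual check-all.mjs API.
function probesFor(model) {
  const id = `nexos-ai/${model}`;
  return [
    // Simple prompt: does the model answer at all?
    { name: "simple", cmd: `opencode run "what is 2+2?" -m "${id}"` },
    // Tool calling: does the model survive a tool round-trip?
    { name: "tools", cmd: `opencode run "list files in current directory" -m "${id}"` },
  ];
}
```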
## Known Bugs

The known-bugs/ directory contains documentation and test scripts for known API issues:

- token-caching — Gemini implicit caching does not do prefix matching (it only caches identical requests); Claude and GPT prefix caching works correctly; Gemini explicit caching works, but nexos.ai does not expose the `cachedContents` API
- gemini3-tools — Gemini 3 models (Flash Preview, Pro Preview) fail on multi-turn tool calling due to missing `thought_signature` support in the nexos.ai API
- thinking — test script for thinking/reasoning blocks across models
## License
MIT
