pi-ollama-keyring
v1.0.0
Ollama cloud provider for pi-coding-agent with multi-key rotation, live model discovery, and persistent key-pool management
What makes this different
The official @0xkobold/pi-ollama package supports only a single API key and ships a stale, hardcoded model list.
pi-ollama-keyring was built for people running multiple Ollama cloud API keys who need:
- a live model inventory pulled directly from their account
- automatic key rotation when a key hits quota or rate limits
- manual key selection that survives restarts
- full tool-calling and thinking support through Pi
Installation

    pi install npm:pi-ollama-keyring

Or add it to your global ~/.pi/agent/settings.json:

    {
      "packages": ["pi-ollama-keyring"]
    }

Configuration
Add your keys and settings to ~/.pi/agent/settings.json:

    {
      "pi-ollama": {
        "cloudUrl": "https://ollama.com",
        "apiKeys": [
          "key-1",
          "key-2",
          "key-3"
        ],
        "activeKeyIndex": 0
      }
    }

A single key also works:

    {
      "pi-ollama": {
        "apiKey": "your-key-here"
      }
    }

Environment variables

    export OLLAMA_API_KEY="your-key"
    export OLLAMA_API_KEYS="key-1,key-2,key-3"
    export OLLAMA_HOST_CLOUD="https://ollama.com"
    export OLLAMA_ACTIVE_KEY_INDEX="0"

Config precedence (highest to lowest)
- Environment variables
- pi.settings runtime API
- .pi/settings.json (project-local)
- ~/.pi/agent/settings.json (global)
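
The precedence order above can be read as a layered merge. The sketch below is one way to express it, assuming a field-by-field merge in which the highest-precedence layer that defines a field wins. The names OllamaSettings, settingsFromEnv, and resolveSettings are hypothetical and are not part of the package's API; how the package actually merges layers may differ.

```typescript
// Hypothetical settings shape; field names mirror the config examples above.
interface OllamaSettings {
  cloudUrl?: string;
  apiKeys?: string[];
  apiKey?: string;
  activeKeyIndex?: number;
}

// Turn the documented environment variables into a settings layer.
// `env` is passed in explicitly so the function stays pure and testable.
function settingsFromEnv(env: Record<string, string | undefined>): OllamaSettings {
  const layer: OllamaSettings = {};
  const host = env["OLLAMA_HOST_CLOUD"];
  const single = env["OLLAMA_API_KEY"];
  const many = env["OLLAMA_API_KEYS"];
  const active = env["OLLAMA_ACTIVE_KEY_INDEX"];
  if (host) layer.cloudUrl = host;
  if (single) layer.apiKey = single;
  if (many) {
    // Comma-separated list; surrounding whitespace and empty entries are dropped.
    const keys = many.split(",").map((k) => k.trim()).filter((k) => k.length > 0);
    if (keys.length > 0) layer.apiKeys = keys;
  }
  if (active !== undefined) {
    const index = parseInt(active, 10);
    if (!isNaN(index) && index >= 0) layer.activeKeyIndex = index;
  }
  return layer;
}

// Merge layers listed from highest to lowest precedence.
function resolveSettings(layers: OllamaSettings[]): OllamaSettings {
  const resolved: OllamaSettings = {};
  // Walk from lowest to highest so that higher layers overwrite lower ones.
  for (let i = layers.length - 1; i >= 0; i--) {
    const layer = layers[i];
    if (!layer) continue;
    if (layer.cloudUrl !== undefined) resolved.cloudUrl = layer.cloudUrl;
    if (layer.apiKeys !== undefined) resolved.apiKeys = layer.apiKeys;
    if (layer.apiKey !== undefined) resolved.apiKey = layer.apiKey;
    if (layer.activeKeyIndex !== undefined) resolved.activeKeyIndex = layer.activeKeyIndex;
  }
  return resolved;
}

// Usage: environment first, global settings file last.
const resolved = resolveSettings([
  settingsFromEnv({ OLLAMA_API_KEYS: "key-1, key-2", OLLAMA_ACTIVE_KEY_INDEX: "1" }),
  {}, // pi.settings runtime API (nothing set in this example)
  { cloudUrl: "https://ollama.com" }, // .pi/settings.json
  { apiKeys: ["global-key"], activeKeyIndex: 0 }, // ~/.pi/agent/settings.json
]);
```

In this example the key pool and active index come from the environment, while cloudUrl falls through to the project-local file because no higher layer sets it.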
Features
- ☁️ Ollama Cloud: talks directly to api.ollama.com via the ollama npm client
- 🔑 Multi-key pool: load as many API keys as you have
- 🔁 Auto-rotation: on quota, rate-limit, or auth failure, the key pool is walked until a working key is found
- 💾 Persistent rotation: manual key changes write activeKeyIndex back to your settings file so the active key survives restarts
- 🎯 Live model discovery: the model list is fetched from your actual Ollama cloud account at startup, not from a stale hardcoded list
- 📊 Accurate context lengths: uses /api/show for real context window sizes
- 👁️ Vision detection: detected from capabilities and model name
- 🧠 Thinking/reasoning: native support for models that expose a thinking block
- 🛠️ Tool calling: Pi tools work end-to-end through the native Ollama chat stream
- 🔄 Manual refresh: re-fetch the cloud model list any time with /ollama-refresh
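
To illustrate the context-length and vision-detection features, here is a minimal sketch that reads those values from an /api/show response. It relies on two fields that Ollama's show endpoint returns: capabilities (a string array) and model_info (which carries an architecture-prefixed context_length entry such as llama.context_length). The function names, the fallback value, and the name pattern are assumptions for illustration, not the package's actual heuristics.

```typescript
// Subset of the fields returned by Ollama's POST /api/show endpoint.
interface ShowResponse {
  capabilities?: string[];
  model_info?: Record<string, unknown>;
}

// Look for an "<architecture>.context_length" entry in model_info.
// `fallback` is a hypothetical default for models that do not report one.
function contextLength(show: ShowResponse, fallback: number): number {
  const info = show.model_info ?? {};
  for (const key of Object.keys(info)) {
    const value = info[key];
    if (/\.context_length$/.test(key) && typeof value === "number") {
      return value;
    }
  }
  return fallback;
}

// Hypothetical name hints; the real name-based rules are not documented here.
const VISION_NAME_HINT = /vision|llava/i;

// A model counts as vision-capable if the API says so or its name suggests it.
function supportsVision(name: string, show: ShowResponse): boolean {
  const caps = show.capabilities ?? [];
  return caps.indexOf("vision") !== -1 || VISION_NAME_HINT.test(name);
}

// Usage with a hand-written response object.
const sample: ShowResponse = {
  capabilities: ["completion", "tools"],
  model_info: { "general.architecture": "llama", "llama.context_length": 131072 },
};
const sampleContext = contextLength(sample, 4096);
const sampleVision = supportsVision("gpt-oss:20b", sample);
```

Checking capabilities first and falling back to the name keeps the API as the source of truth while still covering models whose metadata omits the capability list.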
Commands
| Command | Description |
|---------|-------------|
| /ollama-status | Show cloud connection status and active key |
| /ollama-models | List all discovered cloud models |
| /ollama-refresh | Re-fetch the live model inventory |
| /ollama-info MODEL | Show details for a specific model |
| /ollama-keys | Show masked key-pool status |
| /ollama-rotate | Rotate to the next key and persist the change |
| /ollama-use-key N | Activate a specific key by 1-based index and persist |
How key rotation works
Keys are stored in an ordered pool. On startup, the active key is the one at activeKeyIndex (a 0-based position; default 0).
Automatic rotation happens during any cloud request when the response matches a quota, rate-limit, auth, or server-error pattern:
- HTTP 429, 402, 401, 403, 5xx
- Body containing "quota", "rate limit", "unauthorized", "payment required", etc.
The pool is walked from the current key through all remaining keys until a working one is found. If all keys are exhausted, the error is surfaced to Pi.
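
The rotation rule can be sketched as a classifier plus a loop over the pool. This is a simplified, synchronous illustration under two assumptions that the text does not confirm: the walk wraps around so every key is tried once, and body phrases are only checked on non-2xx responses. AttemptResult, shouldRotate, and findWorkingKey are hypothetical names; real requests would be asynchronous.

```typescript
// Outcome of one request attempt, reduced to what the rotation check needs.
interface AttemptResult {
  status: number;
  body: string;
}

// Phrases taken from the list above; the full pattern list ("etc.") is not shown.
const ROTATE_PHRASES = ["quota", "rate limit", "unauthorized", "payment required"];

// Decide whether a response should trigger a move to the next key.
function shouldRotate(result: AttemptResult): boolean {
  const { status, body } = result;
  // Assumption: successful responses never rotate, whatever the body says.
  if (status >= 200 && status <= 299) return false;
  if (status === 429 || status === 402 || status === 401 || status === 403) return true;
  if (status >= 500 && status <= 599) return true;
  const text = body.toLowerCase();
  return ROTATE_PHRASES.some((phrase) => text.indexOf(phrase) !== -1);
}

// Try each key once, starting at the active index and wrapping around.
// Returns the index of the first key that worked, or -1 if none did,
// together with the last result so the caller can surface the error.
function findWorkingKey(
  pool: string[],
  activeIndex: number,
  attempt: (key: string) => AttemptResult
): { index: number; last: AttemptResult | undefined } {
  let last: AttemptResult | undefined;
  for (let step = 0; step < pool.length; step++) {
    const index = (activeIndex + step) % pool.length;
    const key = pool[index];
    if (key === undefined) continue;
    last = attempt(key);
    if (!shouldRotate(last)) return { index, last };
  }
  return { index: -1, last };
}

// Usage: the second key is rate-limited, so the walk moves on to the third.
const tried: string[] = [];
const outcome = findWorkingKey(["key-1", "key-2", "key-3"], 1, (key) => {
  tried.push(key);
  return key === "key-2"
    ? { status: 429, body: "rate limit exceeded" }
    : { status: 200, body: "ok" };
});
```

Returning the last result alongside the index lets the caller report the final failure when every key has been exhausted.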
Manual rotation via /ollama-rotate or /ollama-use-key N additionally writes activeKeyIndex back to ~/.pi/agent/settings.json so the choice persists across restarts.
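
Because /ollama-use-key takes a 1-based position while activeKeyIndex starts at 0, manual selection involves an index conversion before the value is persisted. The sketch below shows that conversion and a pure helper that updates the settings JSON text. All function names are hypothetical, reading and writing the file is left to the caller, and wrap-around for /ollama-rotate is an assumption.

```typescript
// Convert the 1-based position typed by the user into a 0-based pool index.
function indexFromUserInput(n: number, poolSize: number): number {
  if (Math.floor(n) !== n || n < 1 || n > poolSize) {
    throw new RangeError("key number must be between 1 and " + poolSize);
  }
  return n - 1;
}

// Next index for a manual rotate; assumed to wrap back to the first key.
function nextKeyIndex(current: number, poolSize: number): number {
  if (poolSize <= 0) throw new RangeError("key pool is empty");
  return (current + 1) % poolSize;
}

// Return settings JSON text with "pi-ollama".activeKeyIndex updated,
// leaving every other setting untouched.
function withActiveKeyIndex(settingsJson: string, index: number): string {
  const settings = JSON.parse(settingsJson) as Record<string, unknown>;
  const current = settings["pi-ollama"];
  const section =
    typeof current === "object" && current !== null
      ? (current as Record<string, unknown>)
      : {};
  settings["pi-ollama"] = { ...section, activeKeyIndex: index };
  return JSON.stringify(settings, null, 2);
}

// Usage: the user runs "/ollama-use-key 2" with a two-key pool.
const before =
  '{"packages":["pi-ollama-keyring"],"pi-ollama":{"apiKeys":["key-1","key-2"],"activeKeyIndex":0}}';
const after = withActiveKeyIndex(before, indexFromUserInput(2, 2));
```

Keeping the update as a string-to-string function makes it easy to test without touching ~/.pi/agent/settings.json.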
Provider name
Models are registered under provider ollama-cloud:

    ollama-cloud/minimax-m2.7
    ollama-cloud/kimi-k2.5
    ollama-cloud/gpt-oss:20b

Use them in Pi like any other provider:

    pi --provider ollama-cloud --model gpt-oss:20b

Or add them to enabledModels for Ctrl+P cycling:

    {
      "enabledModels": [
        "ollama-cloud/minimax-m2.7",
        "ollama-cloud/kimi-k2.5",
        "ollama-cloud/gpt-oss:20b"
      ]
    }

Model display
Each model shows source, vision, and reasoning badges plus context size:

    ☁️ 👁️ 🧠 kimi-k2.5 · 1.04T · 262K ctx
    ☁️ 🧠 minimax-m2.7 · 230B · 200K ctx
    ☁️ gpt-oss:20b · 13.8B · 128K ctx

License
MIT
