claude-code-ollama
v0.1.0
A Claude Code skill that routes inference to a locally running Ollama model or an Ollama Cloud model.
Run Claude Code with any open-source model via Ollama. One script. Zero dependencies.
Quick Start
bash scripts/setup.sh glm-4.7-flash # local
bash scripts/setup.sh deepseek-v4-pro:cloud # cloud
bash scripts/setup.sh glm-4.7-flash --launch # auto-open new terminal
What It Does
1. Detect OS (macOS / Linux / WSL)
2. Install Ollama if missing
3. Upgrade if below v0.14.0
4. Warn if your RAM can't handle the model
5. Pull the model if not present
6. Start Ollama server if not running
7. Print launch command (or --launch to auto-open a new terminal)
How It Works
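Two of the setup steps listed under What It Does, OS detection (step 1) and the v0.14.0 version check (step 3), can be sketched in plain shell. This is a minimal illustration under stated assumptions, not the contents of scripts/setup.sh: the function names detect_os and version_lt are hypothetical, and version_lt assumes purely numeric dotted versions.

```bash
# Hypothetical helpers; names are illustrative, not taken from scripts/setup.sh.

# Step 1: classify the host as macos, wsl, linux, or unknown.
detect_os() {
  case "$(uname -s)" in
    Darwin) echo "macos" ;;
    Linux)
      # WSL kernels mention "microsoft" in /proc/version.
      if grep -qi microsoft /proc/version 2>/dev/null; then
        echo "wsl"
      else
        echo "linux"
      fi
      ;;
    *) echo "unknown" ;;
  esac
}

# Step 3: succeed (exit 0) when version $1 is strictly lower than version $2.
# Assumes numeric fields only (no "-rc1" style suffixes).
version_lt() {
  for _field in 1 2 3; do
    # Pad with ".0.0" so short versions such as "0.14" still yield three fields.
    _x=$(printf '%s.0.0\n' "$1" | cut -d. -f"$_field")
    _y=$(printf '%s.0.0\n' "$2" | cut -d. -f"$_field")
    if [ "$_x" -lt "$_y" ]; then return 0; fi
    if [ "$_x" -gt "$_y" ]; then return 1; fi
  done
  return 1  # equal versions are not "lower"
}
```

A caller could then write `if version_lt "$installed" "0.14.0"; then ...upgrade...; fi`, where `$installed` holds whatever version string the script extracted from the local Ollama install.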
Claude Code is an agent loop. The model is swappable. This script sets three env vars that point Claude Code at Ollama instead of Anthropic's API:
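The README does not name the three variables. The sketch below uses the ones Ollama's own Claude Code integration instructions describe; treat the names and values as an assumption and check scripts/setup.sh for what it actually exports.

```bash
# Assumed variables (not listed in this README); values follow Ollama's
# Anthropic-compatibility instructions for Claude Code.
export ANTHROPIC_BASE_URL="http://localhost:11434"  # send API calls to the local Ollama server
export ANTHROPIC_AUTH_TOKEN="ollama"                # placeholder token; Ollama does not validate it
export ANTHROPIC_API_KEY=""                         # cleared so a real Anthropic key is not picked up

# Then start Claude Code against the chosen model (left commented out here):
# claude --model glm-4.7-flash
```

With these set, requests follow the path shown next.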
Claude Code CLI → localhost:11434 (Ollama) → Model (local or cloud)
Model Recommendations
Local (runs on your hardware)
| RAM | Model | Why |
|--------|------------------------|------------------------------------|
| 8GB | qwen3.5:7b | Tight fit, little headroom |
| 16GB | qwen2.5-coder:14b | Best coding bang per GB |
| 32GB | glm-4.7-flash | 128K context + native tool calling |
| 64GB+ | qwen3-coder:30b-a3b | MoE — 30B total, 3B active |
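The local table reads naturally as a lookup from installed RAM to a suggested model. The sketch below assumes the thresholds exactly as listed in the table; total_ram_gb and recommend_model are hypothetical names, and the setup script's own RAM check may work differently.

```bash
# Hypothetical helpers mirroring the local-model table above.

# Report installed RAM in whole GB, rounded to the nearest GB because
# the OS usually reports slightly less than the nominal size.
total_ram_gb() {
  case "$(uname -s)" in
    Darwin)
      _bytes=$(sysctl -n hw.memsize)
      echo $(( (_bytes + 536870912) / 1073741824 ))
      ;;
    *)
      _kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
      echo $(( (_kb + 524288) / 1048576 ))
      ;;
  esac
}

# Map a RAM size in GB to the table's suggestion.
# Fails (exit 1) below 8 GB, where the table lists no local model.
recommend_model() {
  _ram_gb=$1
  if [ "$_ram_gb" -ge 64 ]; then
    echo "qwen3-coder:30b-a3b"
  elif [ "$_ram_gb" -ge 32 ]; then
    echo "glm-4.7-flash"
  elif [ "$_ram_gb" -ge 16 ]; then
    echo "qwen2.5-coder:14b"
  elif [ "$_ram_gb" -ge 8 ]; then
    echo "qwen3.5:7b"
  else
    return 1
  fi
}
```

For example, `recommend_model "$(total_ram_gb)" || echo "Consider a :cloud model"` would fall back to the cloud suggestion on machines below the smallest tier.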
Cloud (runs on Ollama's servers, :cloud suffix)
| Model | Strength |
|--------------------------|-----------------------------------|
| deepseek-v4-pro:cloud | Top coding benchmarks, 1M context |
| kimi-k2.5:cloud | Long-horizon agentic work |
| glm-5:cloud | Best open-weight on SWE-Bench Pro |
Terminal Support (--launch)
Auto-detects your terminal and opens a new tab or window, trying each candidate in the order shown:
- macOS: iTerm2 → WezTerm → Terminal.app
- Linux: gnome-terminal → xterm → konsole
- WSL: Windows Terminal → cmd.exe
Falls back to printing the command if detection fails.
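The detection order above amounts to "use the first candidate that exists, else print the command". A minimal sketch of that pattern for the Linux list follows; first_available is a hypothetical helper, and how the real script actually opens the window is not shown in this README.

```bash
# Hypothetical helper: print the first candidate found on PATH,
# or fail (exit 1) when none of them is installed.
first_available() {
  for _candidate in "$@"; do
    if command -v "$_candidate" >/dev/null 2>&1; then
      echo "$_candidate"
      return 0
    fi
  done
  return 1
}

# Linux preference order from the list above (left commented out):
# term=$(first_available gnome-terminal xterm konsole) \
#   || echo "No supported terminal found; run the printed command manually."
```

Keeping the candidates as plain arguments makes the preference order easy to read and to change per platform.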
Troubleshooting
Invalid discriminator value error: You're probably pointing Claude Code at an OpenAI-compatible API (like LM Studio) instead of Ollama. Ollama v0.14.0+ has native Anthropic API support, so point Claude Code at Ollama directly.
Model runs slow: Try a quantized variant with ollama pull <model>:q4_K_M, or switch to a :cloud model.
Out of memory: The script warns you before pulling. If you ignored the warning, try a :cloud model.
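When chasing the Invalid discriminator value error, it can help to confirm that the server on port 11434 really is Ollama and which version it reports. The sketch below assumes Ollama's /api/version endpoint, which returns JSON of the form {"version":"..."}; extract_version is a hypothetical helper that pulls the string out without needing jq.

```bash
# Hypothetical helper: extract the version from a response such as
# {"version":"0.14.1"}. Prints nothing if no "version" field is found.
extract_version() {
  printf '%s\n' "$1" |
    sed -n 's/.*"version"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Usage (needs a running server, so it is left commented out):
# extract_version "$(curl -s http://localhost:11434/api/version)"
```

An empty result suggests something other than Ollama is answering on that port, or that nothing is listening at all.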
License
MIT
