# Sage — Local LLM Coding Agent CLI
A TypeScript CLI coding agent that connects to any OpenAI-compatible local model server and performs coding tasks autonomously. Built for developers who want Claude Code-style agentic workflows powered by locally-hosted LLMs.
## What It Does
Sage is an autonomous coding assistant that can:
- Read and edit files in your codebase
- Execute shell commands
- Search for files and patterns
- Understand and respond to natural language requests
- Maintain conversation context across sessions
- Stream responses in real-time
Unlike cloud-based AI assistants, Sage runs entirely on your local infrastructure using models hosted on Ollama, llama.cpp, vLLM, LM Studio, or any other OpenAI-compatible server.
## Features
- Tool Calling: Six essential coding tools (read, write, edit, bash, glob, grep)
- Streaming Responses: Real-time token streaming with markdown rendering
- Multi-Model Support: Works with any OpenAI-compatible API endpoint
- Conversation Persistence: Save and restore conversation sessions
- Context Management: Automatic context window tracking and management
- Safety Controls: Workspace boundary enforcement and destructive command confirmations
- Minimal Dependencies: Built with raw `fetch` and native Node.js APIs
- Configuration Hierarchy: CLI args → env vars → config file → defaults
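The configuration hierarchy amounts to a merge in precedence order. A minimal sketch in TypeScript (the `resolveConfig` name and the trimmed-down `SageConfig` shape are illustrative, not Sage's actual exports):

```typescript
// Sketch of the precedence merge: CLI args win, then environment
// variables, then the config file, then built-in defaults.
interface SageConfig {
  baseUrl: string;
  model: string;
  maxTokens: number;
}

const defaults: SageConfig = {
  baseUrl: "http://localhost:11434/v1",
  model: "qwen2.5-coder:14b",
  maxTokens: 4096,
};

function resolveConfig(
  cli: Partial<SageConfig>,
  env: Partial<SageConfig>,
  file: Partial<SageConfig>
): SageConfig {
  // Later spreads override earlier ones, so spread order encodes precedence.
  return { ...defaults, ...file, ...env, ...cli };
}
```

Each layer only supplies the keys it actually sets; anything missing falls through to the next layer down.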
## Quick Start

### Using npx (no installation)

```sh
npx sage-agent
```

### Global installation

```sh
npm install -g sage-agent
sage
```

### From source

```sh
git clone https://github.com/user/sage-agent.git
cd sage-agent
npm install
npm run dev
```

## Requirements
- Node.js 18 or higher
- A local LLM server running an OpenAI-compatible API:
  - Ollama (recommended for ease of use)
  - llama.cpp server
  - vLLM
  - LM Studio
  - Text Generation WebUI (with OpenAI extension)
## Configuration

### CLI Arguments

```sh
sage --base-url http://localhost:8080/v1 \
  --model codellama \
  --max-tokens 4096 \
  --temperature 0.7 \
  --no-confirm
```

| Argument | Description | Default |
|----------|-------------|---------|
| `--base-url <url>` | API base URL | `http://localhost:11434/v1` |
| `--model <name>` | Model name | `qwen2.5-coder:14b` |
| `--max-tokens <n>` | Maximum tokens per response | 4096 |
| `--temperature <n>` | Temperature (0-2) | 0.7 |
| `--no-confirm` | Skip confirmation for destructive bash commands | false |
| `--help` | Show help message | - |
### Environment Variables

```sh
export SAGE_BASE_URL=http://localhost:11434/v1
export SAGE_MODEL=qwen2.5-coder:14b
export SAGE_MAX_TOKENS=4096
export SAGE_TEMPERATURE=0.7
export SAGE_NO_CONFIRM=false
```

### Config File

Create `~/.config/sage/config.json`:
```json
{
  "baseUrl": "http://localhost:11434/v1",
  "model": "qwen2.5-coder:14b",
  "maxTokens": 4096,
  "temperature": 0.7,
  "noConfirm": false,
  "contextWindow": 8192
}
```

## Available Tools
Sage provides six essential tools for coding tasks:
| Tool | Description | Parameters |
|------|-------------|------------|
| `read` | Read file contents with line numbers | `file_path`, optional `offset`/`limit` |
| `write` | Create or overwrite files | `file_path`, `content` |
| `edit` | Exact string replacement in files | `file_path`, `old_string`, `new_string` |
| `bash` | Execute shell commands with timeout | `command`, optional `timeout` |
| `glob` | Find files matching patterns | `pattern`, optional `path` |
| `grep` | Search file contents with regex | `pattern`, optional `path`/`glob`/`flags` |
All file operations are restricted to the current working directory for safety.
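The workspace restriction boils down to resolving each path against the working directory and rejecting anything that escapes it. A sketch of one way to implement such a check (an illustration, not Sage's actual `safety.ts`):

```typescript
import * as path from "node:path";

// Illustrative containment check: resolve the target against the
// workspace root and reject anything that ends up outside it.
function isInsideWorkspace(target: string, workspace: string = process.cwd()): boolean {
  const root = path.resolve(workspace);
  const resolved = path.resolve(root, target);
  const rel = path.relative(root, resolved);
  // An empty result is the root itself; a result starting with ".." (or an
  // absolute path, in Windows cross-drive cases) means the target escaped.
  return rel === "" || (!rel.startsWith("..") && !path.isAbsolute(rel));
}
```

Resolving before comparing is what defeats `../`-style traversal: `src/../../etc/passwd` normalizes to a path outside the root and is rejected.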
## Slash Commands

While in the REPL, you can use these special commands:

| Command | Description |
|---------|-------------|
| `/save [name]` | Save the current conversation (auto-generates a name if not provided) |
| `/load <name>` | Load a saved conversation session |
| `/sessions` | List all saved conversation sessions |
| `/clear` | Clear conversation history and start fresh |
| `/history` | Display the current conversation history |
| `/exit` or `/quit` | Exit Sage |
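Under the hood, `/save`- and `/load`-style persistence can be as simple as serializing the message history to JSON. A hypothetical sketch (the `.sage-sessions` directory and the function names are assumptions, not Sage's actual layout):

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical persistence sketch: /save writes the message history as
// JSON under a sessions directory; /load reads it back.
type Message = { role: "system" | "user" | "assistant"; content: string };

function saveSession(name: string, history: Message[], dir = ".sage-sessions"): string {
  fs.mkdirSync(dir, { recursive: true });
  const file = path.join(dir, `${name}.json`);
  fs.writeFileSync(file, JSON.stringify(history, null, 2));
  return file;
}

function loadSession(name: string, dir = ".sage-sessions"): Message[] {
  return JSON.parse(fs.readFileSync(path.join(dir, `${name}.json`), "utf8"));
}
```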
## Supported Model Servers

Sage works with any server that implements the OpenAI Chat Completions API with streaming support:
### Ollama (Recommended)

```sh
# Install Ollama from https://ollama.ai/
ollama serve                    # Runs on http://localhost:11434/v1 by default
ollama pull qwen2.5-coder:14b
sage
```

### llama.cpp server

```sh
# Build llama.cpp and run the server
./server -m model.gguf --port 8080 --ctx-size 8192
sage --base-url http://localhost:8080/v1 --model model-name
```

### vLLM

```sh
python -m vllm.entrypoints.openai.api_server \
  --model Qwen/Qwen2.5-Coder-14B-Instruct \
  --port 8080
sage --base-url http://localhost:8080/v1 --model Qwen/Qwen2.5-Coder-14B-Instruct
```

### LM Studio

- Start LM Studio and load a model
- Enable "Local Server" in settings (usually http://localhost:1234/v1)
- Run:

```sh
sage --base-url http://localhost:1234/v1 --model <model-name>
```
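All of these servers speak the same streaming protocol: each Server-Sent Events line carries a `data: {json}` payload, and the stream ends with `data: [DONE]`. A minimal sketch of extracting tokens from such chunks (an illustration of the wire format, not Sage's actual client code):

```typescript
// Parse one SSE chunk from an OpenAI-style streaming response and
// collect the text deltas it contains. Stops at the [DONE] sentinel.
function parseSseChunk(chunk: string): string[] {
  const tokens: string[] = [];
  for (const line of chunk.split("\n")) {
    if (!line.startsWith("data: ")) continue;
    const payload = line.slice("data: ".length).trim();
    if (payload === "[DONE]") break; // end-of-stream sentinel
    const delta = JSON.parse(payload)?.choices?.[0]?.delta?.content;
    if (typeof delta === "string") tokens.push(delta);
  }
  return tokens;
}
```

Because this format is shared across Ollama, llama.cpp, vLLM, and LM Studio, a single parser covers all of them.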
## Model Recommendations

Based on testing, these models work well with Sage:

### Best Overall

- Qwen 2.5 Coder 14B (`qwen2.5-coder:14b`) — Excellent tool calling, good reasoning
- Qwen 2.5 Coder 32B (`qwen2.5-coder:32b`) — Best performance if you have the VRAM

### Good Alternatives

- DeepSeek Coder V2 — Strong coding capabilities, good tool use
- Mistral Codestral — Fast, reliable for coding tasks
- Llama 3.1 8B — Lightweight option for systems with limited resources
### Model Requirements

- Models must support function/tool calling
- Recommended: 8GB+ VRAM for 7B models, 16GB+ for 14B models
## Architecture

Sage uses a simple but powerful agent loop pattern:

```
User prompt → System prompt + history → LLM API (streaming)
  → If tool_calls in response:
      Execute tools → Append results → Loop back to LLM
  → If plain text response:
      Display to user → Wait for next input
```

### Key Design Decisions

- Raw `fetch` + SSE parsing instead of the OpenAI SDK — maximizes compatibility across different server implementations
- Readline-based UI instead of React/Ink — simpler, fewer dependencies, sufficient for streaming
- Uniform tool interface — each tool exports `{ name, description, parameters, execute() }` for easy extension
- CWD-scoped operations — all file operations are restricted to the current working directory
- Confirmation prompts — destructive bash commands require user approval (unless `--no-confirm`)
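The loop above can be sketched in a few lines of TypeScript. This is a stubbed illustration of the pattern (the `agentLoop` and `callModel` names, and the demo `echo` tool, are assumptions, not Sage's actual API):

```typescript
// Stubbed sketch of the agent loop: keep calling the model, executing
// any requested tools and appending their results, until the model
// replies with plain text.
type ToolCall = { name: string; args: Record<string, unknown> };
type ModelTurn = { text?: string; toolCalls?: ToolCall[] };
type Message = { role: "user" | "assistant" | "tool"; content: string };

const tools: Record<string, (args: Record<string, unknown>) => Promise<string>> = {
  echo: async (args) => String(args.text), // demo tool
};

async function agentLoop(
  history: Message[],
  callModel: (history: Message[]) => Promise<ModelTurn>
): Promise<string> {
  for (;;) {
    const turn = await callModel(history);
    if (turn.toolCalls && turn.toolCalls.length > 0) {
      for (const call of turn.toolCalls) {
        const result = await tools[call.name](call.args);
        history.push({ role: "tool", content: result }); // feed results back
      }
      continue; // loop back to the LLM with tool output appended
    }
    return turn.text ?? ""; // plain text ends the loop
  }
}
```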
## Development

### Running from source

```sh
npm install
npm run dev            # Run with tsx (no build needed)
npm run build          # Compile TypeScript to dist/
node dist/index.js     # Run the built version
```

### Testing

```sh
npm test               # Run all tests
npm run test:watch     # Watch mode
```

### Project Structure
```
src/
  index.ts             # Entry point, CLI parsing, REPL
  agent.ts             # Core agent loop
  client.ts            # OpenAI-compatible API client
  config.ts            # Configuration loading
  context-manager.ts   # Context window management
  history.ts           # Session persistence
  tools/
    index.ts           # Tool registry
    types.ts           # Tool interface
    read.ts            # Read tool
    write.ts           # Write tool
    edit.ts            # Edit tool
    bash.ts            # Bash tool
    glob.ts            # Glob tool
    grep.ts            # Grep tool
    safety.ts          # Workspace boundaries
  ui/
    terminal.ts        # Streaming output
    markdown.ts        # Terminal markdown rendering
  types.ts             # Shared types
```

## Examples
### Basic file operations

```
> Read the package.json file
[Agent uses read tool]

> Add a new script called "lint" that runs eslint
[Agent uses edit tool to modify package.json]

> Create a new file called CHANGELOG.md with initial content
[Agent uses write tool]
```

### Code analysis

```
> Find all TypeScript files in the src directory
[Agent uses glob tool with pattern "src/**/*.ts"]

> Search for all console.log statements
[Agent uses grep tool with pattern "console\\.log"]

> Show me the implementation of the read tool
[Agent uses read tool on src/tools/read.ts]
```

### Shell commands

```
> Run the tests
[Agent uses bash tool to run "npm test"]

> Check git status
[Agent uses bash tool to run "git status"]
```

## License
MIT
## Contributing
Contributions welcome! Please open an issue or PR on GitHub.
## Troubleshooting

### Model doesn't support tool calling

Some models don't support function calling. Try one of the models from the recommended list above.

### Connection refused

Make sure your local model server is running and that `--base-url` matches your server's endpoint.

### Out of memory errors

Reduce `--max-tokens` or use a smaller model. The default context window is 8192 tokens.
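Staying within the context window usually means trimming the oldest messages once the history grows too large. A hypothetical sketch of such trimming (the ~4-characters-per-token heuristic and the function names are assumptions, not Sage's actual context manager):

```typescript
type Msg = { role: string; content: string };

// Rough token estimate (~4 characters per token) for illustration only.
function estimateTokens(m: Msg): number {
  return Math.ceil(m.content.length / 4);
}

// Drop the oldest non-system messages until the history fits the budget,
// keeping room in the context window for the model's reply.
function trimHistory(history: Msg[], contextWindow = 8192, replyReserve = 4096): Msg[] {
  const budget = contextWindow - replyReserve;
  const system = history.filter((m) => m.role === "system");
  const rest = history.filter((m) => m.role !== "system");
  const total = (msgs: Msg[]) => msgs.reduce((n, m) => n + estimateTokens(m), 0);
  while (rest.length > 0 && total(system) + total(rest) > budget) {
    rest.shift(); // discard the oldest user/assistant message
  }
  return [...system, ...rest];
}
```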
### Tools not executing

Check that the model is actually emitting tool calls (the streaming output shows them). Some models need better prompting or don't support tools at all.
## Links
- GitHub: https://github.com/user/sage-agent
- npm: https://www.npmjs.com/package/sage-agent
- Issues: https://github.com/user/sage-agent/issues
