ultraagent
v1.1.0
Autonomous AI coding assistant for the terminal — powered by local LLMs on Apple Silicon. 18 built-in tools, auto-setup, project scanner, sub-agents.
Getting Started · Commands · Tool System · Project Creation · Configuration
What is UltraAgent?
UltraAgent is a terminal-based AI coding assistant that operates autonomously using an integrated tool system. It reads, writes, and edits files, runs shell commands, searches codebases, manages git, spawns sub-agents, persists memory, and browses the web — all controlled by a configurable three-tier permission system.
Runs 100% local on your machine — no API keys, no cloud, no data leaves your computer. Supports Ollama, LM Studio, MLX-LM, and llama.cpp.
Highlights
- 100% local & private — no API keys, no cloud, all data stays on your machine
- Auto-Setup — detects your hardware and backends, then recommends the optimal model + settings
- 18 built-in tools — file I/O, shell, git, sub-agents, memory, web, planning
- Project creation wizard — scaffold production-ready projects from scratch
- Project scanner — deep analysis auto-injected into every LLM request
- Three-tier permissions — safe / confirm / dangerous, user approves before risky ops
- Agentic loop — LLM autonomously selects tools, executes, observes, iterates (up to 30 rounds)
- Sub-agents — spawn parallel agents for concurrent tasks
- Persistent memory — context that survives across sessions
- Apple Silicon optimized — curated model recommendations for M-series chips
- Smart caching — LM Studio KV-Cache, Flash Attention, quantized KV for optimal performance
- Real-time streaming — token-by-token output with repetition detection
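The agentic loop in the highlights above can be pictured as a decide/execute/observe cycle. Here is an illustrative TypeScript sketch (hypothetical names, not the actual UltraAgent source): the model picks a tool, the runtime executes it, the observation is fed back, and the cycle repeats until the model answers or the 30-round limit is hit.

```typescript
// Illustrative sketch of the agentic loop (hypothetical names, not the
// actual source): decide -> execute -> observe, up to MAX_ROUNDS times.

type ToolCall = { tool: string; args: Record<string, unknown> };
type Step = { call?: ToolCall; answer?: string };

const MAX_ROUNDS = 30;

function runAgentLoop(
  decide: (observations: string[]) => Step,  // stands in for the LLM
  execute: (call: ToolCall) => string,       // stands in for the tool registry
): string {
  const observations: string[] = [];
  for (let round = 0; round < MAX_ROUNDS; round++) {
    const step = decide(observations);
    if (step.answer !== undefined) return step.answer;    // model is done
    if (step.call) observations.push(execute(step.call)); // observe, iterate
  }
  return "round limit reached";
}

// Toy run: read one file, then answer.
const result = runAgentLoop(
  (obs) =>
    obs.length === 0
      ? { call: { tool: "read_file", args: { path: "README.md" } } }
      : { answer: `done after ${obs.length} tool call(s)` },
  (call) => `<contents of ${call.args.path}>`,
);
console.log(result); // "done after 1 tool call(s)"
```

The real loop streams tokens and enforces permissions per tool call; this sketch only shows the control flow.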
Getting Started
Quick Start (Auto-Setup)
```
git clone https://github.com/zurd46/UltraAgent.git
cd UltraAgent
npm install
npm run dev -- local setup   # auto-detects hardware, backend, sets optimal config
npm run dev -- chat          # start coding
```

The `local setup` command automatically:
- Detects your chip, RAM, GPU cores
- Finds running backends (LM Studio, Ollama, MLX-LM, llama.cpp)
- Recommends the best model for your hardware
- Sets optimal context length, KV-cache, batch size, flash attention
- Verifies the connection
Manual Setup
```
npm run dev -- config set provider local-openai
npm run dev -- config set localModel qwen2.5-coder:14b
npm run dev -- config set localBaseUrl http://localhost:1234/v1
npm run dev -- chat
```

Global Installation

```
npm run build && npm install -g .
ultraagent local setup
ultraagent chat
```

Alias: `ua` works everywhere instead of `ultraagent`.
Supported Backends
| Backend | Provider | Default Port | Description |
|:--------|:---------|:-------------|:------------|
| LM Studio | local-openai | localhost:1234 | Desktop app with Metal GPU, automatic prompt caching |
| Ollama | ollama | localhost:11434 | Simplest setup, native Metal GPU support |
| MLX-LM | local-openai | localhost:8080 | Apple's ML Framework, fastest on M-chips |
| llama.cpp | local-openai | localhost:8080 | Maximum control, OpenAI-compatible API |
Recommended Models (Apple Silicon)
| RAM | Model | Size | Quality | Speed |
|:----|:------|:-----|:--------|:------|
| 64GB+ | Llama 3.3 70B (Q4) | 40 GB | Excellent | Slow |
| 32GB | Qwen 2.5 Coder 32B (Q4) | 18 GB | Excellent | Medium |
| 32GB | Codestral 22B | 13 GB | Excellent | Medium |
| 16GB | Qwen 2.5 Coder 14B | 9 GB | Excellent | Fast |
| 16GB | DeepSeek Coder V2 16B | 9 GB | Excellent | Fast |
| 8GB | Llama 3.1 8B | 4.7 GB | Good | Fast |
| 4GB | Qwen 2.5 Coder 3B | 1.9 GB | Basic | Fast |
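The RAM-to-model mapping above can be sketched as a simple descending lookup. This is an illustration only, not the actual `model-recommender` implementation:

```typescript
// Illustrative RAM-based model recommendation (hypothetical, not the
// actual utils/model-recommender.ts): entries are sorted by RAM
// descending, and we return the first entry whose RAM requirement is met.

type Rec = { minRamGb: number; model: string; sizeGb: number };

const RECOMMENDATIONS: Rec[] = [
  { minRamGb: 64, model: "Llama 3.3 70B (Q4)", sizeGb: 40 },
  { minRamGb: 32, model: "Qwen 2.5 Coder 32B (Q4)", sizeGb: 18 },
  { minRamGb: 16, model: "Qwen 2.5 Coder 14B", sizeGb: 9 },
  { minRamGb: 8, model: "Llama 3.1 8B", sizeGb: 4.7 },
  { minRamGb: 4, model: "Qwen 2.5 Coder 3B", sizeGb: 1.9 },
];

function recommend(ramGb: number): string | undefined {
  return RECOMMENDATIONS.find((r) => ramGb >= r.minRamGb)?.model;
}

console.log(recommend(32)); // "Qwen 2.5 Coder 32B (Q4)"
console.log(recommend(16)); // "Qwen 2.5 Coder 14B"
```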
```
ultraagent models --ram 32   # show recommendations for your RAM
```

LM Studio Optimal Settings
When using LM Studio, set these in Settings > Server for best performance:
| Setting | Recommended | Why |
|:--------|:------------|:----|
| Flash Attention | On (not Auto!) | Reduces memory for long contexts |
| KV Cache Quantization | Q8_0 (16-32GB) / F16 (64GB+) | Halves KV-cache memory, minimal quality loss |
| GPU Offload | Max (all layers) | Full Metal GPU acceleration |
| Prompt Caching | Auto | LM Studio caches automatically |
`ultraagent local setup` sets these values automatically based on your hardware.
Commands
```
ultraagent <command> [options]
```

Project
| Command | Description |
|:--------|:------------|
| new | Interactive wizard to scaffold a complete new project |
| scan | Deep-scan the project into docs/scan.md (auto-injected into LLM context) |
| create | Create a project from template |
| analyze | Analyze an existing project's structure and codebase |
Agent
| Command | Description |
|:--------|:------------|
| chat | Interactive session with full tool access |
| run <prompt> | Execute a single prompt non-interactively |
| edit | Edit or refactor existing code |
| plan <prompt> | Generate a structured project plan |
| code <prompt> | Generate code from a prompt |
Configuration & Local LLM
| Command | Description |
|:--------|:------------|
| config show | Display current configuration |
| config set <key> <value> | Update a configuration value |
| config setup | Setup wizard (redirects to local setup) |
| config reset | Reset to defaults |
| models | Recommended local models for your hardware |
| local setup | Auto-detect hardware & set optimal config |
| local status | Check local LLM server health + current settings |
| local pull [model] | Download a model via Ollama |
| local list | List installed Ollama models |
Tool System
UltraAgent ships with 18 tools that the LLM invokes autonomously during a session. Each tool has an assigned permission level.
Permission Model
| Level | Behavior |
|:------|:---------|
| Safe | Executes immediately — no confirmation needed |
| Confirm | Requires user approval (file writes, commits) |
| Dangerous | Requires approval with highlighted warning (shell commands) |
When prompted: [y]es · [n]o · [a]lways allow · [d]eny always
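The prompt choices above map naturally onto a session allowlist and denylist. The following TypeScript sketch (hypothetical names, not the actual `permissions.ts`) shows how such a check might behave:

```typescript
// Illustrative sketch of the three-tier permission check (hypothetical
// names, not the actual source). "[a]lways allow" adds the tool to a
// session allowlist; "[d]eny always" blocks it for the rest of the session.

type Permission = "safe" | "confirm" | "dangerous";
type Choice = "y" | "n" | "a" | "d";

const sessionAllow = new Set<string>();
const sessionDeny = new Set<string>();

function mayRun(
  tool: string,
  level: Permission,
  ask: (tool: string, highlightWarning: boolean) => Choice, // stands in for the prompt
): boolean {
  if (level === "safe") return true;               // executes immediately
  if (sessionDeny.has(tool)) return false;         // denied for the session
  if (sessionAllow.has(tool)) return true;         // allowed for the session
  const choice = ask(tool, level === "dangerous"); // dangerous gets a highlighted warning
  if (choice === "a") sessionAllow.add(tool);
  if (choice === "d") sessionDeny.add(tool);
  return choice === "y" || choice === "a";
}

console.log(mayRun("write_file", "confirm", () => "a")); // true (prompted once)
console.log(mayRun("write_file", "confirm", () => "n")); // true (allowlisted, no prompt)
```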
Tools
| Tool | Permission | Description |
|:-----|:-----------|:------------|
| read_file | Safe | Read file contents with line numbers, offset, limit |
| write_file | Confirm | Create or overwrite files; auto-creates directories |
| edit_file | Confirm | Precise string replacement via diff-based editing |
| bash | Dangerous | Execute shell commands with timeout and output limits |
| glob | Safe | Find files by glob pattern |
| grep | Safe | Search file contents with regex and context lines |
| Tool | Permission | Description |
|:-----|:-----------|:------------|
| git_status | Safe | Working tree status |
| git_diff | Safe | Staged and unstaged changes |
| git_log | Safe | Recent commit history |
| git_commit | Confirm | Create a commit |
| git_branch | Confirm | Create, switch, or list branches |
| Tool | Permission | Description |
|:-----|:-----------|:------------|
| sub_agent | Confirm | Spawn parallel sub-agents for independent tasks |
| memory_save | Safe | Persist context to memory |
| memory_search | Safe | Search persistent memory |
| memory_delete | Safe | Delete a memory entry |
| web_search | Safe | Search the web |
| web_fetch | Safe | Fetch content from a URL |
| plan_create | Safe | Create a structured task plan |
| plan_status | Safe | Show plan progress |
| plan_update | Safe | Update task status |
Project Creation
```
ultraagent new
```

Interactive wizard that scaffolds complete, production-ready projects.
```
# Non-interactive
ultraagent new --name my-app --type fullstack --stack nextjs-prisma-pg --prompt "E-commerce platform"
```

| Option | Description |
|:-------|:------------|
| -n, --name <name> | Project name |
| -t, --type <type> | Project type |
| -s, --stack <stack> | Tech stack |
| -d, --dir <path> | Parent directory |
| -p, --prompt <prompt> | Project description |
| Type | Stacks |
|:-----|:-------|
| webapp | React+Vite, Next.js, Vue 3+Vite, Svelte, Astro |
| fullstack | Next.js+Prisma+PG, React+Express, Vue+FastAPI, T3 Stack |
| api | Express, Fastify, NestJS, FastAPI, Hono, Go+Gin, Rust+Actix |
| cli | TypeScript+Commander, Python+Click, Rust+Clap, Go+Cobra |
| library | TypeScript npm, Python PyPI, Rust Crate |
| mobile | React Native+Expo, React Native Bare |
| desktop | Electron+React, Tauri+React |
| monorepo | Turborepo, Nx, pnpm Workspaces |
| ai-ml | Python+LangChain, Python+PyTorch, TypeScript+LangChain |
| custom | Describe what you need |

Optional features:
- Docker + docker-compose
- CI/CD (GitHub Actions)
- Testing (Unit + Integration)
- Linting + Formatting (ESLint / Prettier)
- Authentication
- Database setup
- API Documentation (OpenAPI / Swagger)
- Environment variables (.env)
- Logging
- Error handling / Monitoring
Output
A complete project with directory structure, config files, typed source code, test setup, README, .gitignore, build scripts, initialized git repo, installed dependencies, and auto-generated docs/scan.md.
Project Scanner
```
ultraagent scan                        # scan current project
ultraagent scan --force                # force re-scan
ultraagent scan --dir /path/to/project
```

Creates docs/scan.md — a comprehensive project analysis that is automatically injected into every LLM request, giving the model full project context.
| Section | Content |
|:--------|:--------|
| Overview | Name, type, language, framework, package manager, file count, LOC |
| Git | Branch, remote, last commit |
| Directory Structure | Full tree (4 levels) |
| Key Files | Entry points with roles |
| Dependencies | All deps with versions |
| Scripts | npm scripts with commands |
| Configuration | tsconfig, eslint, prettier, docker, CI/CD, etc. |
| Test Files | All test file paths |
| API Routes | Detected route files |
| Environment Variables | From .env.example |
| Lines of Code | Total and per extension |
| Architecture | AI-generated analysis of patterns and conventions |
Cache: `scan.md` is valid for 24 hours. Use `--force` to regenerate.
Agent Modes
| Mode | Purpose |
|:-----|:--------|
| chat | General assistance — full tool access, memory, planning |
| create | Project setup — structure, dependencies, config |
| analyze | Code review — architecture, patterns, quality |
| edit | Refactoring — precise edits, minimal changes |
| plan | Task planning — phases, dependencies, risks |
| code | Code generation — typed, tested, convention-aware |
Switch modes in-session with /mode <mode>.
Session Commands
| Command | Description |
|:--------|:------------|
| /help | Available commands |
| /new | Create a new project |
| /scan | Scan project into docs/scan.md |
| /mode <mode> | Switch agent mode |
| /dir <path> | Change working directory |
| /clear | Clear terminal |
| /status | Session status and config |
| /history | Conversation history |
| /undo | Undo last file change |
| /tokens | Token usage |
| /plan | Current plan progress |
| /exit | End session |
Hook System
Automate actions before or after tool calls. Config lives in .ultraagent/hooks.json.
```json
{
  "hooks": [
    {
      "timing": "after",
      "tool": "write_file",
      "command": "npx prettier --write ${file}",
      "enabled": true
    },
    {
      "timing": "after",
      "tool": "edit_file",
      "command": "npm test",
      "enabled": true
    }
  ]
}
```

| Field | Type | Description |
|:------|:-----|:------------|
| timing | `before` \| `after` | When to run |
| tool | string | Tool name (* for all) |
| command | string | Shell command |
| blocking | boolean | Abort tool call on hook failure (before hooks only) |
| enabled | boolean | Active state |
Variables: `${file}` (file path), `${tool}` (tool name)
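A `before` hook with `blocking: true` can also gate risky operations. The sketch below uses hypothetical commands (adapt them to your project): it aborts `edit_file` when the type-check fails and appends a log line after every tool call.

```json
{
  "hooks": [
    {
      "timing": "before",
      "tool": "edit_file",
      "command": "npx tsc --noEmit",
      "blocking": true,
      "enabled": true
    },
    {
      "timing": "after",
      "tool": "*",
      "command": "echo '${tool} finished' >> .ultraagent/tool.log",
      "enabled": true
    }
  ]
}
```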
Context Injection
UltraAgent automatically enriches every LLM request with project context:
| Source | Description |
|:-------|:------------|
| Project detection | Language, framework, package manager, scripts |
| docs/scan.md | Full project scan |
| Persistent memory | Saved context from prior sessions |
| Active plan | Task plan with status |
| ULTRAAGENT.md | Project-specific custom instructions |
| Git status | Branch, changes, recent commits |
Priority: Mode prompt > Project context > scan.md > Memories > Plan > History
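The priority order can be pictured as a budgeted concatenation: higher-priority sources are added first, and lower-priority ones are dropped once the context window fills. A minimal TypeScript sketch, with hypothetical names and a character budget standing in for a real token budget:

```typescript
// Illustrative sketch of priority-ordered context assembly (hypothetical
// names, not the actual source). Higher-priority sources are added first;
// once the budget is exhausted, everything below is dropped.

const PRIORITY = [
  "modePrompt",
  "projectContext",
  "scanMd",
  "memories",
  "plan",
  "history",
] as const;

type Source = (typeof PRIORITY)[number];

function buildContext(
  sources: Partial<Record<Source, string>>,
  budgetChars: number, // stand-in for a real token budget
): string {
  const parts: string[] = [];
  let used = 0;
  for (const key of PRIORITY) {
    const text = sources[key];
    if (!text) continue;
    if (used + text.length > budgetChars) break; // drop this source and all below
    parts.push(text);
    used += text.length;
  }
  return parts.join("\n\n");
}

const prompt = buildContext(
  { history: "old conversation turns", modePrompt: "You are in edit mode." },
  60,
);
console.log(prompt); // mode prompt first, then history
```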
Custom Instructions
Create ULTRAAGENT.md in your project root:
```markdown
# Project Instructions
- TypeScript monorepo using pnpm
- Run `pnpm test` after changes
- Use conventional commits
```

Global instructions: `~/.ultraagent/instructions.md`
Configuration
```
ultraagent local setup                        # auto-detect & configure (recommended)
ultraagent config set provider local-openai
ultraagent config set localModel qwen2.5-coder:14b
ultraagent config show                        # view config
ultraagent config reset                       # reset to defaults
```

Environment Variables

```
ULTRAAGENT_PROVIDER=local-openai
ULTRAAGENT_LOCAL_BASE_URL=http://localhost:1234/v1
ULTRAAGENT_LOCAL_MODEL=qwen2.5-coder:14b
ULTRAAGENT_LOCAL_CONTEXT_LENGTH=32768
ULTRAAGENT_LOCAL_TEMPERATURE=0.7
ULTRAAGENT_LOCAL_GPU_LAYERS=-1
ULTRAAGENT_LOCAL_BATCH_SIZE=512
ULTRAAGENT_LOCAL_FLASH_ATTENTION=true
ULTRAAGENT_LOCAL_KV_CACHE_TYPE=q8_0
```

| Option | Default | Description |
|:-------|:--------|:------------|
| model | qwen2.5-coder:14b | Model identifier |
| provider | local-openai | ollama or local-openai |
| maxTokens | 8192 | Max output tokens per response |
| streaming | true | Real-time token streaming |
| theme | default | default · minimal · verbose |
| history | true | Keep conversation history |
| maxSubAgents | 5 | Max concurrent sub-agents |
| localBaseUrl | (auto) | Local LLM server URL (auto-detected per provider) |
| localModel | — | Local model name |
| localGpuLayers | -1 | GPU layers (-1 = all) |
| localContextLength | 32768 | Context window size |
| localTemperature | 0.7 | Sampling temperature |
| localBatchSize | 512 | Inference batch size |
| localFlashAttention | true | Enable flash attention |
| localKvCacheType | q8_0 | KV-cache quantization (f16, q8_0, q4_0) |
Architecture
```
src/
├── cli.ts                    # CLI entry point (Commander.js)
├── index.ts                  # Programmatic API exports
├── commands/                 # Command implementations
│   ├── new.ts                # Project creation wizard
│   ├── scan.ts               # Project scanner
│   ├── chat.ts               # Interactive session
│   ├── edit.ts               # Code editing
│   ├── run.ts                # Single prompt execution
│   ├── config-cmd.ts         # Configuration management
│   ├── local.ts              # Local LLM auto-setup
│   └── models.ts             # Model recommendations
├── core/                     # Core engine
│   ├── agent-factory.ts      # Agent loop, streaming, context injection
│   ├── session.ts            # Interactive REPL with slash commands
│   ├── context.ts            # Git context & project instructions
│   ├── context-manager.ts    # Token-aware history management
│   ├── local-llm.ts          # Local LLM provider adapters
│   ├── planner.ts            # Plan creation & task tracking
│   ├── hooks.ts              # Pre/post tool execution hooks
│   ├── token-tracker.ts      # Usage tracking
│   └── undo.ts               # File change rollback
├── tools/                    # Tool system (18 tools)
│   ├── base.ts               # Registry, permission levels, definitions
│   ├── permissions.ts        # Permission manager & session allowlist
│   ├── read.ts / write.ts / edit.ts / bash.ts / glob.ts / grep.ts
│   ├── git.ts                # git_status, git_diff, git_log, git_commit, git_branch
│   ├── sub-agent.ts          # Parallel sub-agent execution
│   ├── memory.ts             # memory_save, memory_search, memory_delete
│   ├── web.ts                # web_search, web_fetch
│   └── index.ts              # Barrel exports
├── ui/                       # Terminal UI
│   ├── theme.ts              # Colors, gradients, ASCII banner
│   ├── spinner.ts            # Loading spinners
│   └── permission-prompt.ts  # Interactive permission dialog
└── utils/
    ├── config.ts             # Zod-validated config with env support
    ├── hardware.ts           # Apple Silicon hardware detection
    ├── backend-detect.ts     # LLM backend auto-detection
    ├── model-recommender.ts  # RAM-based model recommendation engine
    └── files.ts              # Project detection & analysis
```

Tech Stack
| Category | Technology |
|:---------|:-----------|
| Language | TypeScript 5.7 (ES Modules) |
| AI | LangChain · LangGraph |
| LLM Backends | Ollama · LM Studio · MLX-LM · llama.cpp |
| CLI | Commander.js · Inquirer.js |
| Validation | Zod |
| UI | Chalk · Ora · Boxen · Gradient-string · cli-table3 |
| Testing | Vitest |
Development
```
npm run build          # compile TypeScript -> dist/
npm run dev            # run via tsx (no build)
npm start              # run compiled output
npm test               # run tests
npm run test:watch     # tests in watch mode
npm run clean          # remove dist/
```

Contributing
Contributions are welcome. Please open an issue first to discuss what you'd like to change.
- Fork the repository
- Create your branch (`git checkout -b feature/my-feature`)
- Commit your changes
- Push to the branch
- Open a Pull Request
License
Built by Daniel Zurmuhle
Made in Switzerland
